Abstract

Model-based object recognition commonly involves using a minimal set of matched model and 
image points to compute the pose of the model in image coordinates. Furthermore, recognition 
systems often rely on the “weak-perspective” imaging model in place of the perspective imaging 
model. This paper discusses computing the pose of a model from three corresponding points 
under weak-perspective projection. A new solution to the problem is proposed which, like 
previous solutions, involves solving a biquadratic equation. Here the biquadratic is motivated 
geometrically and its solutions, comprised of an actual and a false solution, are interpreted 
graphically. The final equations take a new form, which leads to a simple expression for the
image position of any unmatched model point.
1 Introduction


Recognizing an object generally requires finding correspondences between features of a model 
and an image. Since finding corresponding features often requires trying all possible correspondences,
recognition systems frequently use correspondences between minimal sets of features
to compute poses of the model. For instance, “alignment” techniques repeatedly hypothesize
correspondences between minimal sets of model and image features, and then use those correspondences
to compute model poses, which are used to find other model-image correspondences
(e.g., [5], [10], [1], [9], [28], [29], [15], [3], [16]-[18], [30], [19]). In addition, “pose clustering” 
techniques use every correspondence between a minimal set of model and image features to 
compute a model pose, and then count the number of times each pose is repeated (e.g., [2], 
[26], [25], [23], [11], [4]). 

For computing poses of 3D objects from 2D images, a model of projection must be selected, 
and typically either perspective or “weak-perspective” projection is chosen. Weak-perspective 
projection is an orthographic projection plus a scaling, which serves to approximate perspective 
projection by assuming that all points on a 3D object are at roughly the same distance from the 
camera. For both perspective and weak-perspective projections, the minimal number of points 
needed to compute a model pose up to a finite number of solutions is three ([10], [18]). For 
point features, then, the problem is to determine the pose of three points in space given three 
corresponding image points. When perspective projection is the imaging model, the problem 
is known as the “perspective three-point problem” [10]. When weak-perspective is used, I shall 
call the problem the “weak-perspective three-point problem.” 

A few methods for solving the weak-perspective three-point problem have been suggested in 
the past ([20], [8], [17], [18], [12]), and this paper proposes a new method (solution). The major
differences in the new solution are that it motivates and explains the solution geometrically,
and it does not compute a model-to-image transformation as an intermediate step. As will be 
demonstrated later, understanding the geometry is useful for seeing under which circumstances 
the solution simplifies or breaks, and for analyzing where the solution is stable. Furthermore, 
a geometric understanding may be useful for seeing how the solution is affected by error in the 
image and the model. 

In addition to providing a geometric interpretation, the solution in this paper gives direct 
expressions for the three matched model points in image coordinates, as well as an expression 
for the position in the image of any additional, unmatched model point. Earlier methods all 
require the intermediate computation of a model-to-image transformation. This is meaningful 
because, as mentioned above, many alignment-based recognition systems calculate the 3D pose 
solution many times while searching for the correct pose of the model. Consequently, avoiding 
the intermediate calculation of the transformation could cause such systems to run faster. 

To illustrate how significant such a speed-up can be, consider a system that performs 3D 
recognition by alignment using point features to generate hypotheses. The input to the system 
is a model and an image, and the goal is to identify all instances of the model in the image. 
The model is specified by a set of 3D points that can be detected reliably in images, along with 
any number of extended features whose projections can be predicted using points (e.g., line 
segments, some sets of curves, and edges represented point-by-point). From the image, a set of
2D points is extracted by a low-level process that looks for points of the type corresponding to 
points in the model. The alignment algorithm proceeds as follows: 

1. Hypothesize a correspondence between three model points and three image points. 

2. Compute the 3D pose of the model from the three-point correspondence. 

3. Predict the image positions of the remaining model points and extended features using 
the 3D pose. 

4. Verify whether the hypothesis is correct by looking in the image near the predicted positions
of the model features for corresponding image features.

This process is repeated until all pairs of triples of model and image (sensed) points have been
tried. For m model points and s sensed points, there are $\binom{m}{3}\binom{s}{3}3!$ distinct pairs of model and
image point triples. Consequently, the running time for the algorithm grows with the cubes
of the numbers of model and image points. Since these numbers can be large, the model and
image points typically are grouped in advance so that only triples of points from the groups
have to be tried (e.g., [23], [21], [18]). This can bring the number of pairs of triples into a
range where the algorithm is practical. Then a “constant-times” speed-up in the innermost
loop of the algorithm, that is, steps 2-4 listed above, could give a substantial improvement in
the overall execution time. As already suggested, the solution given in this paper should make
steps 2 and 3 significantly faster.
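
To give a feel for how quickly the number of hypotheses grows, the count above can be evaluated and the hypotheses enumerated with a short Python sketch. The function names `count_triple_pairings` and `triple_pairings` are illustrative assumptions, not names from any existing recognition system.

```python
from itertools import combinations, permutations
from math import comb, factorial


def count_triple_pairings(m: int, s: int) -> int:
    """Number of distinct pairings of a model triple with an ordered image triple."""
    return comb(m, 3) * comb(s, 3) * factorial(3)


def triple_pairings(model_points, image_points):
    """Yield every hypothesis (model triple, ordered image triple) of step 1."""
    for model_triple in combinations(model_points, 3):
        for image_triple in combinations(image_points, 3):
            # Each of the 3! orderings is a different point-to-point correspondence.
            for ordered_image in permutations(image_triple):
                yield model_triple, ordered_image


if __name__ == "__main__":
    print(count_triple_pairings(10, 20))
```

Even for ten model points and twenty image points the count is already in the hundreds of thousands, which is why the cost of the innermost loop matters.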

From observing previous solutions, the solution given in this paper most resembles Ullman’s 
([28], [17]), in that both end up having to solve the same biquadratic equation, although each 
derives the biquadratic differently. In this sense, the solution given here is an extension of 
Ullman’s, because, unlike Ullman’s solution, it resolves which of the two non-equivalent solutions 
to the biquadratic is correct. In addition, this paper explains graphically why the two solutions 
arise and to what geometry each corresponds. 

There is an intrinsic geometry that underlies the perspective three-point problem; it is 
shown in Fig. 1. In the figure, the three model points, $\vec{m}_0$, $\vec{m}_1$, and $\vec{m}_2$, are being perspectively
projected onto three image points, $\vec{i}_0$, $\vec{i}_1$, and $\vec{i}_2$, via lines through the center of projection (center
point), $\vec{p}$. The task is to recover $\vec{m}_0$, $\vec{m}_1$, and $\vec{m}_2$. The essential information is contained in the
side lengths and angles of the surrounding tetrahedron. 

Similar to the perspective case, there is an intrinsic geometry underlying the weak-perspective
three-point problem, shown in Fig. 2. The picture shows the three model points being
projected orthographically onto the plane that contains $\vec{m}_0$ and is parallel to the image plane,
and then shows them being scaled down into the image. In addition, the picture shows the 
model points first being scaled down and then projected onto the image plane. In each case, 
the projection is represented by a solid with right angles as shown. The smaller solid is a 
scaled-down version of the larger. The relevant information consists of the side lengths of the 
solids and the scale factor. 

In what follows, first the perspective case is discussed (Section 2). Then I summarize 
how to compute 3D pose from three corresponding points under weak-perspective projection 
(Section 3). Third the 3D pose solution is shown to exist and be unique, and a geometrical
interpretation is provided (Section 4). Next a direct expression is derived for the image position 
of an unmatched model point (Section 5). Then I review earlier solutions to the problem and 
present the three most related solutions in detail (Sections 7 and 8). In addition, the new and 
earlier solutions are examined and compared in terms of their stabilities (Sections 6 and 9). 

2 The Perspective Solution 

To see the difference between the perspective and weak-perspective cases, first let us observe
exactly what is required for the perspective three-point problem. As pictured in Fig. 1, I will
work in camera-centered coordinates with the center point at the origin and the line of sight
along the z axis. The distances $R_{01}$, $R_{02}$, and $R_{12}$ come from the original, untransformed model
points. The angles $\theta_{01}$, $\theta_{02}$, and $\theta_{12}$ can be computed from the positions of the image points,
the focal length, and the center point. To see this, let $f$ equal the focal length, and let the
image points $\vec{i}_0$, $\vec{i}_1$, $\vec{i}_2$ be extended as follows: $(x, y) \to (x, y, f)$. Then

$$\cos\theta_{01} = \hat{i}_0 \cdot \hat{i}_1, \quad \cos\theta_{02} = \hat{i}_0 \cdot \hat{i}_2, \quad \cos\theta_{12} = \hat{i}_1 \cdot \hat{i}_2, \tag{1}$$

where in general $\hat{v}$ denotes the unit vector in the direction of $\vec{v}$. The problem is to determine
$a$, $b$, and $c$ given $R_{01}$, $R_{02}$, $R_{12}$, $\cos\theta_{01}$, $\cos\theta_{02}$, and $\cos\theta_{12}$. From the picture, we see by the
law of cosines that

$$a^2 + b^2 - 2ab\cos\theta_{01} = R_{01}^2 \tag{2}$$
$$a^2 + c^2 - 2ac\cos\theta_{02} = R_{02}^2 \tag{3}$$
$$b^2 + c^2 - 2bc\cos\theta_{12} = R_{12}^2 \tag{4}$$
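
As a small illustration of Equations 1-4, the following Python sketch computes the three cosines from image points and a focal length, and evaluates the left-hand sides of the law-of-cosines constraints for given depths. The helper names `viewing_ray_cosines` and `squared_side_lengths` are assumptions made for this example; solving the constraints for the unknown depths is not attempted here.

```python
import math


def viewing_ray_cosines(i0, i1, i2, f):
    """Cosines of the angles between the viewing rays (Equation 1)."""
    def unit_ray(p):
        # Extend the image point (x, y) to (x, y, f) and normalize.
        x, y = p
        norm = math.sqrt(x * x + y * y + f * f)
        return (x / norm, y / norm, f / norm)

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    u0, u1, u2 = unit_ray(i0), unit_ray(i1), unit_ray(i2)
    return dot(u0, u1), dot(u0, u2), dot(u1, u2)


def squared_side_lengths(a, b, c, cosines):
    """Left-hand sides of Equations 2-4 for candidate depths a, b, c."""
    c01, c02, c12 = cosines
    return (a * a + b * b - 2 * a * b * c01,
            a * a + c * c - 2 * a * c * c02,
            b * b + c * c - 2 * b * c * c12)
```

A candidate triple of depths is consistent with the model exactly when the three returned values equal $R_{01}^2$, $R_{02}^2$, and $R_{12}^2$.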


Over time, there have been many solutions to the problem, all of which start with the above 
equations. The solutions differ in how they manipulate the equations when solving for the 
unknowns. Recently, Haralick et al. reviewed the various solutions and examined their stabilities [13].

Given a, b, and c, we easily can compute the 3D locations of the model points: 

$$\vec{m}_0 = a\hat{i}_0, \quad \vec{m}_1 = b\hat{i}_1, \quad \vec{m}_2 = c\hat{i}_2. \tag{5}$$

If a 3D rigid transformation is desired, it can be determined from the original 3D model points 
and the 3D camera-centered model points just computed. A simple method for doing so is 
given in Appendix A; for a least-squares solution, see Horn [14]. 

Although perspective (central) projection is a more accurate model, numerous researchers 
have used weak-perspective projection instead (e.g., [24], [20], [7], [8], [25], [28], [29], [21], [22], 
[16]-[18], [3], [30], [19], [12]). The justification for using weak-perspective is that in many cases 
it approximates perspective closely. In particular, for many imaging situations if the size of the 
model in depth (distance in z) is small compared to the depth of the model centroid, then the 
difference should be negligible [25]. 

There are some advantages to using weak-perspective instead of perspective. In particular, 
computations involving weak-perspective often are less complicated. In addition, the weak-perspective
math model is conceptually simpler, since it uses orthographic instead of perspective
projection. Another advantage is that we do not need to know the camera focal length or center 
point. Furthermore, there are fewer solutions to deal with—four for perspective and two for 
weak-perspective ([10], [18]). It should be understood, however, that finding two solutions 
instead of four is only an advantage if the four solutions actually collapse to two; otherwise, at 
least two of the solutions are missed. 

Lastly, the weak-perspective imaging model can be used without modification to recognize 
scaled versions of the same object, since the built-in scale factor incorporates object scale. For 
perspective to handle scale, an additional scale parameter must be used. On the other hand, 
weak-perspective is unable to distinguish objects that differ only in size, since a smaller scale could
mean the object is smaller or further away. Nonetheless, in cases where the weak-perspective
approximation applies, the perspective solution may be unstable in distinguishing different-sized
objects ([27], [29], [18]). In these cases, moving the object further out in depth, that is,
past the point where perspective and weak-perspective projections are essentially equivalent,
will have the same effect in the image as uniformly scaling the object down in size. Since the
perspective solution always distinguishes the depth and size of the object, this suggests that
small variations in the image could lead to very different interpretations for the size as well as
the depth.

In sum, there are significant advantages to using weak-perspective in place of perspective, 
and under many viewing conditions the weak-perspective approximation is close to perspective. 
As suggested in the introduction, for these situations it would be useful to know how to solve, 
using weak-perspective projection, the problems of recovering the 3D pose of a model and 
computing the image position of a fourth model point. 

3 Computing the Weak-Perspective Solution 

This section provides a summary of the results I will derive in the next two sections. Specifically, 
it tells how to compute the locations of the three matched model points and the image location 
of any additional, unmatched model point. 

For reference, the geometry underlying weak-perspective projection between three corresponding
points, which was described in the introduction, is shown in Fig. 2. All that is
pertinent to recovering the 3D pose of the model are the distances between the model and
image points. Let the distances between the model points be $(R_{01}, R_{02}, R_{12})$, and the corresponding
distances between the image points be $(d_{01}, d_{02}, d_{12})$. Then the parameters of the
geometry in Fig. 2 are

$$s = \sqrt{\frac{b + \sqrt{b^2 - ac}}{a}}$$
$$(h_1, h_2) = \pm\left(\sqrt{(sR_{01})^2 - d_{01}^2},\; \sigma\sqrt{(sR_{02})^2 - d_{02}^2}\right)$$
$$(H_1, H_2) = \frac{1}{s}(h_1, h_2)$$

where

$$a = (R_{01} + R_{02} + R_{12})(-R_{01} + R_{02} + R_{12})(R_{01} - R_{02} + R_{12})(R_{01} + R_{02} - R_{12})$$
$$b = d_{01}^2(-R_{01}^2 + R_{02}^2 + R_{12}^2) + d_{02}^2(R_{01}^2 - R_{02}^2 + R_{12}^2) + d_{12}^2(R_{01}^2 + R_{02}^2 - R_{12}^2)$$
$$c = (d_{01} + d_{02} + d_{12})(-d_{01} + d_{02} + d_{12})(d_{01} - d_{02} + d_{12})(d_{01} + d_{02} - d_{12})$$
$$\sigma = \begin{cases} 1 & \text{if } d_{01}^2 + d_{02}^2 - d_{12}^2 \le s^2(R_{01}^2 + R_{02}^2 - R_{12}^2), \\ -1 & \text{otherwise.} \end{cases}$$

As the equations show, the solution has a two-way ambiguity except when $h_1$ and $h_2$ are zero.
The ambiguity corresponds to a reflection about a plane parallel to the image plane. When
$h_1 = h_2 = 0$, the model triangle (the triangle defined by the three model points) is parallel to
the image triangle (the triangle defined by the three image points). As a note, $a$ and $c$ measure
sixteen times the squares of the areas of the model and image triangles, respectively. Further,
the solution fails when the model triangle degenerates to a line, in which case $a = 0$; in fact,
this is the only instance in which a solution may not exist (for a discussion of this case, see
Section 4.5). Note, however, that no such restriction is placed on the image triangle; so the
image points may be collinear. Even so, care should be taken since the solution may be unstable
when image points are collinear, when the model points are collinear, or when one of the sides
of the model triangle is parallel to the image plane (see Section 6).

Figure 2: Model points $\vec{m}_0$, $\vec{m}_1$, and $\vec{m}_2$ undergoing orthographic projection plus scale
to produce image points $\vec{i}_0$, $\vec{i}_1$, and $\vec{i}_2$.
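
The summary formulas for $s$, $(h_1, h_2)$, and $(H_1, H_2)$ translate fairly directly into code. The sketch below is one possible transcription in Python; the function name `weak_perspective_pose`, its return layout, and the clamping of slightly negative radicands (a guard against round-off) are assumptions added for illustration and are not part of the derivation.

```python
import math


def weak_perspective_pose(R01, R02, R12, d01, d02, d12):
    """Return (s, (h1, h2), (H1, H2)) for one of the two reflected solutions.

    The second solution is obtained by negating both altitudes.
    """
    a = ((R01 + R02 + R12) * (-R01 + R02 + R12)
         * (R01 - R02 + R12) * (R01 + R02 - R12))
    b = (d01**2 * (-R01**2 + R02**2 + R12**2)
         + d02**2 * (R01**2 - R02**2 + R12**2)
         + d12**2 * (R01**2 + R02**2 - R12**2))
    c = ((d01 + d02 + d12) * (-d01 + d02 + d12)
         * (d01 - d02 + d12) * (d01 + d02 - d12))
    if a <= 0.0:
        raise ValueError("model triangle is degenerate (a = 0)")

    disc = max(b * b - a * c, 0.0)          # clamp round-off below zero
    s = math.sqrt((b + math.sqrt(disc)) / a)

    h1 = math.sqrt(max((s * R01)**2 - d01**2, 0.0))
    h2 = math.sqrt(max((s * R02)**2 - d02**2, 0.0))
    # sigma fixes the sign of h2 relative to h1.
    if d01**2 + d02**2 - d12**2 > s * s * (R01**2 + R02**2 - R12**2):
        h2 = -h2
    return s, (h1, h2), (h1 / s, h2 / s)
```

For example, with model distances $\sqrt{20}$, $\sqrt{11}$, $\sqrt{27}$ and image distances $2$, $\sqrt{2.5}$, $\sqrt{4.5}$, which come from scaling the points $(0,0,0)$, $(4,0,2)$, $(1,3,-1)$ by one half and dropping the depth coordinate, the sketch recovers a scale of one half and altitudes of opposite sign.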

Next, I give an expression for the image location of a fourth model point. Originally, the
model points are in some arbitrary model coordinate frame. Also, the image points are in
a camera-centered coordinate frame in which the image serves as the x-y plane. Denote the
original, untransformed model points by $\vec{p}_i$, to distinguish them from the camera-centered model
points $\vec{m}_i$ shown in Fig. 2. Using $\vec{p}_0$, $\vec{p}_1$, and $\vec{p}_2$, solve the following vector equation for the
“extended affine coordinates,” $(\alpha, \beta, \gamma)$, of $\vec{p}_3$:

$$\vec{p}_3 = \alpha(\vec{p}_1 - \vec{p}_0) + \beta(\vec{p}_2 - \vec{p}_0) + \gamma(\vec{p}_1 - \vec{p}_0) \times (\vec{p}_2 - \vec{p}_0) + \vec{p}_0 \tag{13}$$

Given image points $\vec{i}_0 = (x_0, y_0)$, $\vec{i}_1 = (x_1, y_1)$, and $\vec{i}_2 = (x_2, y_2)$, let

$$x_{01} = x_1 - x_0, \quad y_{01} = y_1 - y_0,$$
$$x_{02} = x_2 - x_0, \quad y_{02} = y_2 - y_0.$$

Then the image location of the transformed and projected $\vec{p}_3$ is

$$(\alpha x_{01} + \beta x_{02} + \gamma(y_{01}H_2 - y_{02}H_1) + x_0,\; \alpha y_{01} + \beta y_{02} + \gamma(-x_{01}H_2 + x_{02}H_1) + y_0). \tag{14}$$
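
A minimal Python sketch of Equations 13 and 14 follows. It solves for the extended affine coordinates by exploiting the fact that the cross-product direction is perpendicular to both edge vectors, which is one convenient way to solve Equation 13 and not necessarily the one the author had in mind. The function names are hypothetical.

```python
def _sub(u, v):
    return tuple(ui - vi for ui, vi in zip(u, v))


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def extended_affine_coords(p0, p1, p2, p3):
    """Solve Equation 13 for (alpha, beta, gamma); p0, p1, p2 must not be collinear."""
    v1, v2, q = _sub(p1, p0), _sub(p2, p0), _sub(p3, p0)
    n = _cross(v1, v2)
    gamma = _dot(q, n) / _dot(n, n)       # n is perpendicular to v1 and v2
    g11, g12, g22 = _dot(v1, v1), _dot(v1, v2), _dot(v2, v2)
    r1, r2 = _dot(q, v1), _dot(q, v2)
    det = g11 * g22 - g12 * g12           # non-zero for a proper triangle
    alpha = (r1 * g22 - r2 * g12) / det
    beta = (r2 * g11 - r1 * g12) / det
    return alpha, beta, gamma


def predict_fourth_point(i0, i1, i2, H1, H2, coords):
    """Image position of the unmatched model point (Equation 14)."""
    alpha, beta, gamma = coords
    x01, y01 = i1[0] - i0[0], i1[1] - i0[1]
    x02, y02 = i2[0] - i0[0], i2[1] - i0[1]
    x = alpha * x01 + beta * x02 + gamma * (y01 * H2 - y02 * H1) + i0[0]
    y = alpha * y01 + beta * y02 + gamma * (-x01 * H2 + x02 * H1) + i0[1]
    return x, y
```

Because the coordinates $(\alpha, \beta, \gamma)$ depend only on the model, they can be computed once per model triple and reused for every hypothesized image triple; only the evaluation of Equation 14 has to run in the inner loop.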

Lastly, the weak-perspective solution can be used to compute the 3D locations of the model 
points in camera-centered coordinates: 


$$\vec{m}_0 = \frac{1}{s}(x_0, y_0, w) \tag{15}$$
$$\vec{m}_1 = \frac{1}{s}(x_1, y_1, h_1 + w) \tag{16}$$
$$\vec{m}_2 = \frac{1}{s}(x_2, y_2, h_2 + w), \tag{17}$$


where w is an unknown offset in a direction normal to the image plane. It is worth noting that 
if the 3D rigid transform that brings the model into camera-centered coordinates is desired, it 
can be computed from these three camera-centered model points and the original three model 
points. The unknown offset w drops out when computing the rotation and remains only in the 
z coordinate of the translation, which cannot be recovered. As mentioned in Section 2, a simple 
method for computing the transform is given in Appendix A, and a least-squares solution was 
given by Horn [14]. 
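
Equations 15-17 can be sketched in Python as below. The function name `camera_centered_points` is an assumption, and the unknown offset $w$ is exposed as a parameter defaulting to zero purely for convenience, since the problem itself cannot determine it.

```python
import math


def camera_centered_points(i0, i1, i2, s, h1, h2, w=0.0):
    """3D model points in camera-centered coordinates (Equations 15-17)."""
    (x0, y0), (x1, y1), (x2, y2) = i0, i1, i2
    m0 = (x0 / s, y0 / s, w / s)
    m1 = (x1 / s, y1 / s, (h1 + w) / s)
    m2 = (x2 / s, y2 / s, (h2 + w) / s)
    return m0, m1, m2
```

A quick sanity check is that the distances between the returned points reproduce the model distances $R_{01}$, $R_{02}$, and $R_{12}$ regardless of the value chosen for $w$.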


4 Existence and Uniqueness of the 3D Pose Solution 


In deriving the 3D pose solution, I start with the basic geometry for the weak-perspective three-point
problem, shown in Fig. 2. Fig. 3 shows the smaller solid again with more labels. There
are three right triangles in the solid, from which three constraints can be generated:

$$h_1^2 + d_{01}^2 = (sR_{01})^2 \tag{18}$$
$$h_2^2 + d_{02}^2 = (sR_{02})^2 \tag{19}$$
$$(h_1 - h_2)^2 + d_{12}^2 = (sR_{12})^2 \tag{20}$$

It should be pointed out that the distances $R_{01}$, $R_{02}$, $R_{12}$, $d_{01}$, $d_{02}$, $d_{12}$ and the scale factor $s$
are all positive, but the altitudes $h_1$, $h_2$ along with $H_1$, $H_2$ are signed. Since $h_1$ and $h_2$ are
signed, having “$h_1 - h_2$” in the third equation is an arbitrary choice over “$h_1 + h_2$”; it was
chosen because, when $h_1$ and $h_2$ are positive, it directly corresponds to the picture in Fig. 3.

Multiplying the third equation by $-1$ and adding all three gives

$$2h_1h_2 = s^2(R_{01}^2 + R_{02}^2 - R_{12}^2) - (d_{01}^2 + d_{02}^2 - d_{12}^2).$$

Squaring and using the first two equations again to eliminate $h_1^2$ and $h_2^2$, we have

$$4(s^2R_{01}^2 - d_{01}^2)(s^2R_{02}^2 - d_{02}^2) = \left(s^2(R_{01}^2 + R_{02}^2 - R_{12}^2) - (d_{01}^2 + d_{02}^2 - d_{12}^2)\right)^2, \tag{21}$$

which leads to a biquadratic in $s$ (for details see Appendix B):

$$as^4 - 2bs^2 + c = 0, \tag{22}$$

where

$$a = 4R_{01}^2R_{02}^2 - (R_{01}^2 + R_{02}^2 - R_{12}^2)^2$$
$$\phantom{a} = (R_{01} + R_{02} + R_{12})(-R_{01} + R_{02} + R_{12})(R_{01} - R_{02} + R_{12})(R_{01} + R_{02} - R_{12})$$
$$b = 2R_{01}^2d_{02}^2 + 2R_{02}^2d_{01}^2 - (R_{01}^2 + R_{02}^2 - R_{12}^2)(d_{01}^2 + d_{02}^2 - d_{12}^2)$$
$$\phantom{b} = d_{01}^2(-R_{01}^2 + R_{02}^2 + R_{12}^2) + d_{02}^2(R_{01}^2 - R_{02}^2 + R_{12}^2) + d_{12}^2(R_{01}^2 + R_{02}^2 - R_{12}^2)$$
$$c = 4d_{01}^2d_{02}^2 - (d_{01}^2 + d_{02}^2 - d_{12}^2)^2$$
$$\phantom{c} = (d_{01} + d_{02} + d_{12})(-d_{01} + d_{02} + d_{12})(d_{01} - d_{02} + d_{12})(d_{01} + d_{02} - d_{12})$$

This biquadratic is equivalent to the one originally derived by Ullman. But Ullman made no 
attempt to interpret or decide among its solutions, which will be done here. 
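
The two forms given for each coefficient, and the claim that the true scale satisfies the biquadratic, can be spot-checked numerically. The Python sketch below does this; the function names are assumptions introduced for the example.

```python
import math


def coefficients_expanded(R01, R02, R12, d01, d02, d12):
    """a, b, c in the form that falls out of expanding Equation 21."""
    R = R01**2 + R02**2 - R12**2
    D = d01**2 + d02**2 - d12**2
    a = 4 * R01**2 * R02**2 - R**2
    b = 2 * R01**2 * d02**2 + 2 * R02**2 * d01**2 - R * D
    c = 4 * d01**2 * d02**2 - D**2
    return a, b, c


def coefficients_factored(R01, R02, R12, d01, d02, d12):
    """a, b, c in the factored (triangle-area style) form."""
    a = ((R01 + R02 + R12) * (-R01 + R02 + R12)
         * (R01 - R02 + R12) * (R01 + R02 - R12))
    b = (d01**2 * (-R01**2 + R02**2 + R12**2)
         + d02**2 * (R01**2 - R02**2 + R12**2)
         + d12**2 * (R01**2 + R02**2 - R12**2))
    c = ((d01 + d02 + d12) * (-d01 + d02 + d12)
         * (d01 - d02 + d12) * (d01 + d02 - d12))
    return a, b, c


def biquadratic_residual(s, a, b, c):
    """Left-hand side of Equation 22 evaluated at scale s."""
    return a * s**4 - 2 * b * s**2 + c
```

For the configuration obtained by scaling the points $(0,0,0)$, $(4,0,2)$, $(1,3,-1)$ by one half, both forms give $a = 864$, $b = 180$, $c = 36$, and the residual at $s = 0.5$ vanishes up to round-off.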

To prove existence and uniqueness, the biquadratic’s solutions must be examined. We are
interested only in positive, real solutions for $s$, the scale factor. In general, the positive solutions
of the biquadratic are given by

$$s = \sqrt{\frac{b \pm \sqrt{b^2 - ac}}{a}}. \tag{23}$$

Depending on the radicands, there will be zero, one, or two real solutions. Particularly, we
are interested in whether each number of solutions can arise, and, if so, to what the solutions
correspond geometrically.

To begin, let us determine the signs of $a$, $b$, and $c$. In Fig. 2, let $\phi$ denote the angle between
$\vec{m}_1 - \vec{m}_0$ and $\vec{m}_2 - \vec{m}_0$, and let $\psi$ be the angle between $\vec{i}_1 - \vec{i}_0$ and $\vec{i}_2 - \vec{i}_0$. Notice by the law
of cosines that

$$a = 4R_{01}^2R_{02}^2 - (2R_{01}R_{02}\cos\phi)^2 = 4(R_{01}R_{02}\sin\phi)^2 \tag{24}$$
$$b = 2R_{01}^2d_{02}^2 + 2R_{02}^2d_{01}^2 - (2R_{01}R_{02}\cos\phi)(2d_{01}d_{02}\cos\psi)$$
$$\phantom{b} = 2(R_{01}^2d_{02}^2 + R_{02}^2d_{01}^2 - 2R_{01}R_{02}d_{01}d_{02}\cos\phi\cos\psi) \tag{25}$$
$$c = 4d_{01}^2d_{02}^2 - (2d_{01}d_{02}\cos\psi)^2 = 4(d_{01}d_{02}\sin\psi)^2 \tag{26}$$

Further, $\frac{1}{2}R_{01}R_{02}\sin\phi$ equals the area of the model triangle, so that $a$ measures sixteen times
the square of the area of the model triangle. Analogously, $c$ measures sixteen times the square
of the area of the image triangle.

In what follows, I assume that the model triangle is not degenerate, that is, not simply a
line or a point. This situation is the only time the solution is not guaranteed to exist. (For
a discussion of this case see Section 4.5.) Note that this assumption implies that $a \neq 0$ and
$\phi \neq 0$.

From Equations 24 and 26, clearly $a > 0$ and $c \ge 0$. From Equation 25, it is straightforward
to see that $b \ge 0$:

$$b = 2(R_{01}^2d_{02}^2 + R_{02}^2d_{01}^2 - 2R_{01}R_{02}d_{01}d_{02}\cos\phi\cos\psi)$$
$$\phantom{b} \ge 2(R_{01}^2d_{02}^2 + R_{02}^2d_{01}^2 - 2R_{01}R_{02}d_{01}d_{02}), \text{ since } \cos\phi \le 1,\ \cos\psi \le 1$$
$$\phantom{b} = 2(R_{01}d_{02} - R_{02}d_{01})^2$$
$$\phantom{b} \ge 0$$


Returning to Equation 23, Appendix E shows that $b^2 - ac \ge 0$. From this fact and that
$a > 0$, $b \ge 0$, and $c \ge 0$, we can derive that there are in general two solutions for $s$ with a single
special case when $b^2 - ac = 0$, which can be seen as follows:

$$b^2 - ac \ge 0 \implies b \pm \sqrt{b^2 - ac} \ge 0, \text{ since } b \ge 0 \text{ and } ac \ge 0$$
$$\implies \frac{b \pm \sqrt{b^2 - ac}}{a} \ge 0, \text{ since } a > 0.$$

Hence

$$s = \sqrt{\frac{b \pm \sqrt{b^2 - ac}}{a}},$$

which gives one or two solutions for the biquadratic, depending on whether $b^2 - ac$ is positive
or equal to zero.

Next, I show that of the two solutions for the scale, exactly one of them is valid, that is, corresponds
to an orthographic projection of the model points onto the image points. Furthermore,
the other solution arises from inverting the model and image distances in Fig. 2. In addition,
there being one solution for scale corresponds to the special case in which the model triangle is 
parallel to the image plane. The following proposition, which is proved in Appendix C, will be 
useful in establishing these claims. 

Proposition 1: Let

$$s_1 = \sqrt{\frac{b - \sqrt{b^2 - ac}}{a}}, \quad s_2 = \sqrt{\frac{b + \sqrt{b^2 - ac}}{a}}. \tag{27}$$

Then

$$s_1 \le \frac{d_{01}}{R_{01}},\ \frac{d_{02}}{R_{02}} \le s_2.$$
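
Proposition 1 is proved in Appendix C, but it can also be spot-checked numerically. The Python sketch below draws random model triangles, projects them with a random scale, and tests the two inequalities up to a small tolerance. The function names, the sampling ranges, and the tolerance are all assumptions of this example, and a passing check is of course not a proof.

```python
import math
import random


def biquadratic_roots(R01, R02, R12, d01, d02, d12):
    """Return (s1, s2) of Equation 27, clamping round-off in the radicands."""
    R = R01**2 + R02**2 - R12**2
    D = d01**2 + d02**2 - d12**2
    a = 4 * R01**2 * R02**2 - R**2
    b = 2 * R01**2 * d02**2 + 2 * R02**2 * d01**2 - R * D
    c = 4 * d01**2 * d02**2 - D**2
    root = math.sqrt(max(b * b - a * c, 0.0))
    s1 = math.sqrt(max(b - root, 0.0) / a)
    s2 = math.sqrt((b + root) / a)
    return s1, s2


def check_proposition_1(trials=100, seed=0, tol=1e-6):
    """True if s1 <= d01/R01, d02/R02 <= s2 held in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Random 3D model points; dropping z and scaling is a weak-perspective view.
        pts = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
        scale = rng.uniform(0.2, 2.0)
        img = [(scale * p[0], scale * p[1]) for p in pts]
        R01, R02, R12 = (math.dist(pts[0], pts[1]),
                         math.dist(pts[0], pts[2]),
                         math.dist(pts[1], pts[2]))
        d01, d02, d12 = (math.dist(img[0], img[1]),
                         math.dist(img[0], img[2]),
                         math.dist(img[1], img[2]))
        s1, s2 = biquadratic_roots(R01, R02, R12, d01, d02, d12)
        lo, hi = sorted((d01 / R01, d02 / R02))
        if not (s1 <= lo + tol and hi <= s2 + tol):
            return False
    return True
```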


4.1 The true solution for scale 


Here it is shown that exactly one of the two solutions for scale can satisfy the geometry shown in
Fig. 2, and it is always the same one. If the two solutions are the same, then both solutions can
satisfy the geometry (this case is discussed in Section 4.3). As will be seen, the valid solution is

$$s = \sqrt{\frac{b + \sqrt{b^2 - ac}}{a}}.$$

Note that proving this statement establishes the existence and uniqueness of the solution given
in Section 3.

In Fig. 2, $(sR_{01})^2 - d_{01}^2 = h_1^2 \ge 0$ and $(sR_{02})^2 - d_{02}^2 = h_2^2 \ge 0$, which implies that any
solution $s$ for scale satisfies

$$\frac{d_{01}}{R_{01}} \le s \quad \text{and} \quad \frac{d_{02}}{R_{02}} \le s.$$

Consequently, Proposition 1 implies that $s_2$ is the only possible solution. Still, the question
remains whether $s_2$ is itself a solution; the fact that it satisfies the biquadratic (Equation 22)
is not sufficient since the steps used to derive the biquadratic from Equations 18-20 are not
always reversible due to the squaring used to obtain Equation 21.

Next, I show that $s_2$ is indeed a solution by giving an assignment to the remaining variables
that satisfies the constraints in Equations 18-20. Since $(sR_{01})^2 - d_{01}^2 \ge 0$ and $(sR_{02})^2 - d_{02}^2 \ge 0$,
we can set $h_1^2 = (sR_{01})^2 - d_{01}^2$ and $h_2^2 = (sR_{02})^2 - d_{02}^2$, which immediately give Equations 18
and 19. Furthermore, we know $s$ satisfies Equation 22, or, equivalently, Equation 21. Substitute
$h_1^2$ and $h_2^2$ into the left-hand side of Equation 21:

$$4h_1^2h_2^2 = \left(s^2(R_{01}^2 + R_{02}^2 - R_{12}^2) - (d_{01}^2 + d_{02}^2 - d_{12}^2)\right)^2,$$

which is the same as

$$\pm 2h_1h_2 = s^2(R_{01}^2 + R_{02}^2 - R_{12}^2) - (d_{01}^2 + d_{02}^2 - d_{12}^2).$$

At this point, we are free to choose the signs of $h_1$ and $h_2$. In particular, let the sign of $h_1$
match the sign on the left-hand side so that

$$2h_1h_2 = s^2(R_{01}^2 + R_{02}^2 - R_{12}^2) - (d_{01}^2 + d_{02}^2 - d_{12}^2). \tag{29}$$

Once this choice is made, we are forced to choose the sign of $h_2$ to make the sign of the left-hand
side consistent with the right-hand side. In particular, let $\sigma$ be the sign of $h_2$. Then unless the
right-hand side is 0,

$$\sigma = \begin{cases} 1 & \text{if } d_{01}^2 + d_{02}^2 - d_{12}^2 < s^2(R_{01}^2 + R_{02}^2 - R_{12}^2), \\ -1 & \text{if } d_{01}^2 + d_{02}^2 - d_{12}^2 > s^2(R_{01}^2 + R_{02}^2 - R_{12}^2). \end{cases}$$

On the other hand, if $s^2(R_{12}^2 - R_{01}^2 - R_{02}^2) = d_{12}^2 - d_{01}^2 - d_{02}^2$, then Equation 29 implies $h_1$ or
$h_2$ is 0, so that the sign of $h_2$ is not forced and so is arbitrary. Having chosen the sign of $h_2$,
substituting $h_1^2$ and $h_2^2$ into the right-hand side of Equation 29 gives

$$2h_1h_2 = h_1^2 + h_2^2 - (s^2R_{12}^2 - d_{12}^2),$$

or

$$(h_1 - h_2)^2 = s^2R_{12}^2 - d_{12}^2,$$

which is Equation 20.

Returning to the signs of $h_1$ and $h_2$, there is a two-way ambiguity in the sign of $h_1$ which
imposes the same two-way ambiguity on the pairs $(h_1, h_2)$ and $(H_1, H_2)$. As can be seen in
Fig. 2, the ambiguity geometrically corresponds to a flip of the plane containing the space points
$\vec{m}_0$, $\vec{m}_1$, and $\vec{m}_2$. The flip is about a plane in space that is parallel to the image plane, but
which plane it is cannot be determined since the problem gives no information about offsets
of the model in the z direction. Due to the reflection, for planar objects the two solutions are
equivalent, in that they give the same image points when projected. On the other hand, for
non-planar objects the two solutions project to different sets of image points.

There is a special case, as mentioned above, when the sign of $h_2$ is arbitrary relative to the
sign of $h_1$. In this case, the right-hand side of Equation 29 is zero, and this implies that $h_1$ or
$h_2$ is zero also. Looking at Fig. 2, geometrically what is occurring is that one of the sides of the
model triangle that emanates from $\vec{m}_0$ lies parallel to the image plane, so that the reflective
ambiguity is obtained by freely changing the sign of the non-zero altitude.
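
The argument of this section can be mirrored in a few lines of Python: for a candidate scale, try to build altitudes that satisfy Equations 18-20. The function name `altitudes_if_valid` and the tolerance handling are assumptions of this sketch.

```python
import math


def altitudes_if_valid(s, R01, R02, R12, d01, d02, d12, tol=1e-9):
    """Return one (h1, h2) satisfying Equations 18-20 for scale s, or None."""
    q1 = (s * R01)**2 - d01**2      # must equal h1^2
    q2 = (s * R02)**2 - d02**2      # must equal h2^2
    q12 = (s * R12)**2 - d12**2     # must equal (h1 - h2)^2
    if q1 < -tol or q2 < -tol or q12 < -tol:
        return None
    h1 = math.sqrt(max(q1, 0.0))
    h2_abs = math.sqrt(max(q2, 0.0))
    # Try both signs of h2 relative to h1; Equation 20 decides which one fits.
    for h2 in (h2_abs, -h2_abs):
        if abs((h1 - h2)**2 - q12) <= tol * max(1.0, abs(q12)):
            return h1, h2
    return None
```

On the running example (model distances $\sqrt{20}$, $\sqrt{11}$, $\sqrt{27}$; image distances $2$, $\sqrt{2.5}$, $\sqrt{4.5}$), the larger root $s_2 = 0.5$ yields valid altitudes, whereas the smaller root $s_1 = \sqrt{1/6}$ makes $(sR_{01})^2 - d_{01}^2$ negative and is rejected.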

4.2 The inverted solution for scale 

Of the two solutions for scale that satisfy the biquadratic, we know that one of them
corresponds to the geometry in Fig. 2, but what about the other? Using a similar argument to
that used to prove $s_2$ is a solution for the weak-perspective geometry, we can infer a geometric
interpretation for $s_1$. Consider, then, $s = s_1$. The interpretation I will derive satisfies the
equations,

$$H_1^2 + R_{01}^2 = (rd_{01})^2 \tag{30}$$
$$H_2^2 + R_{02}^2 = (rd_{02})^2 \tag{31}$$
$$(H_1 - H_2)^2 + R_{12}^2 = (rd_{12})^2, \tag{32}$$

where $r = \frac{1}{s}$. Observe that $r = \frac{1}{s_1}$ and $s_2$ have similar forms (see Equation 27):

$$r = \sqrt{\frac{a}{b - \sqrt{b^2 - ac}}} = \sqrt{\frac{b + \sqrt{b^2 - ac}}{c}}. \tag{33}$$

To begin the derivation, Proposition 1 gives that $d_{01}^2 - (sR_{01})^2 \ge 0$ and $d_{02}^2 - (sR_{02})^2 \ge 0$,
which implies we can set $h_1^2 = d_{01}^2 - (sR_{01})^2$ and $h_2^2 = d_{02}^2 - (sR_{02})^2$. Dividing through by $s^2$ gives
Equations 30 and 31. As before, since $s$ satisfies Equation 22 and, equivalently, Equation 21,
we can substitute into Equation 21 with $h_1^2$ and $h_2^2$ to obtain

$$(h_1 - h_2)^2 = d_{12}^2 - s^2R_{12}^2,$$

where the sign of $h_2$ relative to $h_1$ is 1 if $d_{01}^2 + d_{02}^2 - d_{12}^2 > s^2(R_{01}^2 + R_{02}^2 - R_{12}^2)$, and $-1$ otherwise.
Dividing through by $s^2$ gives Equation 32, and so the derivation is completed.

Geometrically, Equation 30 forms a right triangle with sides $H_1$ and $R_{01}$ and hypotenuse
$rd_{01}$. Analogously, Equations 31 and 32 imply right triangles as well. The interpretation is
displayed in Fig. 4. Another way to see what is occurring geometrically is to note that the roles of
image and model distances from Equations 18-20 are inverted in Equations 30-32. In effect, what
is happening is that instead of scaling down the model triangle and projecting it orthographically
onto the image triangle, the image triangle is being scaled up and projected orthographically
onto the model triangle, that is, projected along parallel rays that are perpendicular to the
model triangle. This interpretation is shown in Fig. 5 as a rotated version of Fig. 4.
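
The inverted interpretation can be checked numerically as well. The Python sketch below computes $r$ from Equation 33 and altitudes $H_1$, $H_2$ from Equations 30 and 31, so that Equation 32 can then be verified. The function name `inverted_solution` is an assumption of this example, and the sketch requires $c > 0$, that is, a non-collinear image triangle.

```python
import math


def inverted_solution(R01, R02, R12, d01, d02, d12):
    """Return (r, H1, H2) for the inverted geometry of Equations 30-32."""
    R = R01**2 + R02**2 - R12**2
    D = d01**2 + d02**2 - d12**2
    a = 4 * R01**2 * R02**2 - R**2
    b = 2 * R01**2 * d02**2 + 2 * R02**2 * d01**2 - R * D
    c = 4 * d01**2 * d02**2 - D**2
    if c <= 0.0:
        raise ValueError("image triangle is degenerate (c = 0)")
    root = math.sqrt(max(b * b - a * c, 0.0))
    r = math.sqrt((b + root) / c)            # Equation 33, equal to 1 / s1
    H1 = math.sqrt(max((r * d01)**2 - R01**2, 0.0))
    H2 = math.sqrt(max((r * d02)**2 - R02**2, 0.0))
    s1_sq = 1.0 / (r * r)
    # Relative sign of the altitudes: +1 if D > s1^2 * R, and -1 otherwise.
    if D <= s1_sq * R:
        H2 = -H2
    return r, H1, H2
```

On the running example the sketch gives $r = \sqrt{6}$ and $H_1 = H_2 = 2$, which satisfies Equation 32 because $R_{12}^2 = 27 = 6 \cdot 4.5$.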
4.3 Model triangle is parallel to the image plane

The two solutions for the scale factor are the same when $b^2 - ac = 0$, and here I demonstrate that
geometrically this corresponds to the plane containing the three model points being parallel to
the image plane. Before proving this, let us establish the existence of the solution for scale in
this special case of $b^2 - ac = 0$. Looking at Equation 23,

$$b^2 - ac = 0 \implies b \pm \sqrt{b^2 - ac} = b \implies s = \sqrt{\frac{b}{a}}$$

is a solution to the biquadratic since $a > 0$ and $b \ge 0$.

Appendix D shows that $b^2 - ac = 0$ exactly when $\phi = \psi$ or $\phi = \psi + \pi$ and $\frac{d_{01}}{R_{01}} = \frac{d_{02}}{R_{02}}$.
Using this result and Equations 24 and 26,

$$s = \sqrt{\frac{b}{a}} = \sqrt[4]{\frac{c}{a}} = \sqrt{\frac{|d_{01}d_{02}\sin\psi|}{|R_{01}R_{02}\sin\phi|}} = \frac{d_{01}}{R_{01}} = \frac{d_{02}}{R_{02}} \tag{34}$$

$$\implies h_1 = \sqrt{(sR_{01})^2 - d_{01}^2} = 0, \quad h_2 = \sqrt{(sR_{02})^2 - d_{02}^2} = 0.$$

Thus $b^2 - ac = 0$ only if the model triangle is parallel to the image plane.

Conversely, if the model triangle is parallel to the image plane, it must be that $\phi = \psi$.
Further, in this case $h_1 = h_2 = 0$, so that

$$s = \frac{d_{01}}{R_{01}} = \frac{d_{02}}{R_{02}}, \tag{35}$$

which from Appendix D implies that $b^2 - ac = 0$.

Since the two solutions are the same, we know that $s_1 = s_2 = \frac{1}{r}$. Notice in Figs. 3 and 4
that the geometric interpretations for the two solutions for scale collapse to the same solution
when $h_1 = h_2 = H_1 = H_2 = 0$ and $s = \frac{1}{r}$. As a result, when there is one solution for scale,
there is also one solution for $(h_1, h_2)$ and $(H_1, H_2)$, albeit $(0, 0)$.
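
A concrete instance of this special case is easy to construct. The Python sketch below uses a hypothetical helper `scale_roots` to evaluate both roots of the biquadratic; applied to a 3-4-5 model triangle viewed frontally at scale one half (image sides 1.5, 2, 2.5), the discriminant vanishes and both roots equal one half.

```python
import math


def scale_roots(R01, R02, R12, d01, d02, d12):
    """Return (s1, s2, b*b - a*c) using the factored coefficient forms."""
    a = ((R01 + R02 + R12) * (-R01 + R02 + R12)
         * (R01 - R02 + R12) * (R01 + R02 - R12))
    b = (d01**2 * (-R01**2 + R02**2 + R12**2)
         + d02**2 * (R01**2 - R02**2 + R12**2)
         + d12**2 * (R01**2 + R02**2 - R12**2))
    c = ((d01 + d02 + d12) * (-d01 + d02 + d12)
         * (d01 - d02 + d12) * (d01 + d02 - d12))
    disc = b * b - a * c
    root = math.sqrt(max(disc, 0.0))
    s1 = math.sqrt(max(b - root, 0.0) / a)
    s2 = math.sqrt((b + root) / a)
    return s1, s2, disc


if __name__ == "__main__":
    # Frontal view of a 3-4-5 triangle at scale 0.5.
    print(scale_roots(3.0, 4.0, 5.0, 1.5, 2.0, 2.5))
```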


4.4 Model triangle is perpendicular to the image plane 

The situation where the model triangle is perpendicular to the image plane is of interest since
the projection is a line. Note, however, that the solution given earlier makes no exception for
this case as long as the model triangle is not degenerate. As for what happens in this case,
since the image triangle is a line, we know $\psi = 0 \implies c = 0 \implies$ Equation 23 becomes

$$s = \sqrt{\frac{b \pm \sqrt{b^2}}{a}}. \tag{36}$$

As shown above, of the two solutions for scale, the true one is $\sqrt{\frac{2b}{a}}$ and the inverted one is 0.

To see why the inverted solution is zero, recall that the solution can be viewed as scaling
and projecting the image triangle onto the model triangle, using for scale $r = \frac{1}{s}$, which in this
case does not exist. Since the image triangle is a line, graphically this amounts to trying to
scale a line so that it can project as a triangle, which is not possible.
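
This case, too, can be illustrated with a small example. In the Python sketch below, the model points $(0,0,0)$, $(3,0,4)$, $(6,0,0)$ lie in a plane perpendicular to the image and are viewed at unit scale, so the image points are collinear. The helper name `perpendicular_case_roots` is an assumption of the example.

```python
import math


def perpendicular_case_roots(R01, R02, R12, d01, d02, d12):
    """Return (c, s1, s2); for collinear image points c should be zero."""
    a = ((R01 + R02 + R12) * (-R01 + R02 + R12)
         * (R01 - R02 + R12) * (R01 + R02 - R12))
    b = (d01**2 * (-R01**2 + R02**2 + R12**2)
         + d02**2 * (R01**2 - R02**2 + R12**2)
         + d12**2 * (R01**2 + R02**2 - R12**2))
    c = ((d01 + d02 + d12) * (-d01 + d02 + d12)
         * (d01 - d02 + d12) * (d01 + d02 - d12))
    root = math.sqrt(max(b * b - a * c, 0.0))
    s1 = math.sqrt(max(b - root, 0.0) / a)   # inverted root, zero when c = 0
    s2 = math.sqrt((b + root) / a)           # true root, sqrt(2b / a) when c = 0
    return c, s1, s2


if __name__ == "__main__":
    # Model sides 5, 6, 5; collinear image points at x = 0, 3, 6.
    print(perpendicular_case_roots(5.0, 6.0, 5.0, 3.0, 6.0, 3.0))
```

Here $a = 2304$ and $b = 1152$, so the true root $\sqrt{2b/a}$ is exactly the unit scale used to build the example.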


4.5 Model triangle is a line 

This is the one case where the solution for the scale fails, and it fails because $a$, which is
a measure of the area of the model triangle, is zero. Despite this fact, we can determine
when a solution exists. First, we know that the image triangle must be a line as well. To
see if this condition is enough, consider looking for a 3D rotation and scale that leaves $s\vec{m}_1$
orthographically projecting onto $\vec{i}_1$ as in Fig. 6. Observe that every such rotation and scale
leaves $s\vec{m}_2$ projecting onto the same point in the image. This means that for a solution to
exist, it must be that

$$\frac{d_{01}}{R_{01}} = \frac{d_{02}}{R_{02}}.$$

Even when the image triangle is a line, this in general is not true. When it is true, there is an
infinity of solutions corresponding to every scaled rotation that leaves $s\vec{m}_1$ projecting onto $\vec{i}_1$.

Another way to look at this situation is to notice that the model triangle being a line when
using the true solution is analogous to the image triangle being a line when using the inverted
solution, where the roles of the model and image triangles are reversed. As discussed in the
previous section, the image triangle is a line when the model triangle is perpendicular to the
image plane. The analysis there reveals that for the inverted solution the scale factor $r$ is
undefined, which means that here the true solution for the scale factor $s$ is undefined as well.

4.6 Summary 

Our goal was to determine the three unknown parameters of the geometry displayed in Fig. 3,
namely $s$, $h_1$, and $h_2$. The figure gave three constraints (Equations 18-20), from which a
biquadratic in the scale factor $s$ was derived. The biquadratic always has two positive solutions,
and its coefficients, $a$, $b$, and $c$, are all non-negative. Of the two solutions, Section 4.1 showed
that one and only one can satisfy the three constraints, and that solution is $s = s_2$ from
Proposition 1 (see Equation 27). Given $s$, there are two pairs of valid assignments for $h_1$ and
$h_2$. They correspond to reflecting the plane of the three matched model points about any plane
parallel to the image; all planes parallel to the image plane are equally good. This proved that the
solution for 3D pose exists and is unique up to the reflective ambiguity.

In Section 4.2, Proposition 1 was used to infer the geometry that gives rise to the other
solution to the biquadratic, namely $s = s_1$ (Equation 27). This solution, which is illustrated in
Fig. 5, is obtained by inverting the roles of the model and image points in Fig. 3. The difference
with the inverted solution is that the image points are being scaled and then orthographically
projected onto the model points, instead of the reverse. The inverted geometry satisfies three
constraints, Equations 30-32, that parallel the true constraints in function and form. Similarly,
the expression for the scale factor of the inverted solution, $r = 1/s_1$ (Equation 33), parallels the
expression for the true scale factor, $s = s_2$.

Three special cases were discussed next, one in which the plane of the matched model points
is parallel to the image plane (Section 4.3), one in which it is perpendicular to the image plane,
or, equivalently, in which the matched image points are collinear (Section 4.4), and one in which
the matched model points are collinear (Section 4.5). The first case is the one and only situation
in which the two solutions collapse to the same one, and in this case $h_1 = h_2 = 0$. In addition,
this situation is exactly where the two solutions to the biquadratic are the same; this is seen
geometrically by looking at Figs. 3 and 4 with $h_1$, $h_2$, $H_1$, and $H_2$ all zero and $s = s_1 = s_2$.

In the case where the matched image points are collinear, the solution for 3D pose is still 
valid. It is interesting to note, however, that the inverted solution for the scale factor does not 
exist. Yet the inverted solution for the scale does exist when the model points are collinear, but, 
in this case, the true solution does not. Section 4.5 determined that when the model points are 
collinear a solution for 3D pose may still exist, but if and only if a further constraint is satisfied. 
The section concludes by giving the constraint and describing how it arises geometrically. 

5 Image Position of a Fourth Model Point 

To compute the position in the image of a fourth model point, I first use the solution from 
the previous section to compute its 3D position in camera-centered coordinates. By so doing, 
I can project the camera-centered model point under weak-perspective and obtain the image 
position without having to calculate a model-to-image transformation. Let the image points be
$\vec{i}_0 = (x_0, y_0)$, $\vec{i}_1 = (x_1, y_1)$, and $\vec{i}_2 = (x_2, y_2)$. Given $s$, $h_1$, and $h_2$, we can invert the projection to
get the three model points:

$$ \vec{m}_0 = \frac{1}{s}(x_0, y_0, w) $$

$$ \vec{m}_1 = \frac{1}{s}(x_1, y_1, h_1 + w) $$

$$ \vec{m}_2 = \frac{1}{s}(x_2, y_2, h_2 + w), $$

where w is an unknown offset in a direction normal to the image plane. 

Given three 2D points, $\vec{q}_0$, $\vec{q}_1$, and $\vec{q}_2$, a fourth 2D point $\vec{q}_3$ can be uniquely represented by
its “affine coordinates,” $(\alpha, \beta)$, which are given by the equation

$$ \vec{q}_3 = \alpha(\vec{q}_1 - \vec{q}_0) + \beta(\vec{q}_2 - \vec{q}_0) + \vec{q}_0. $$

Given three 3D points, $\vec{p}_0$, $\vec{p}_1$, and $\vec{p}_2$, this representation can be extended to uniquely represent
any other 3D point $\vec{p}_3$ in terms of what I shall call its “extended affine coordinates,” $(\alpha, \beta, \gamma)$,
as follows:

$$ \vec{p}_3 = \alpha(\vec{p}_1 - \vec{p}_0) + \beta(\vec{p}_2 - \vec{p}_0) + \gamma(\vec{p}_1 - \vec{p}_0) \times (\vec{p}_2 - \vec{p}_0) + \vec{p}_0 \tag{37} $$
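
As an illustration of Equation 37, the following sketch recovers the extended affine coordinates of a 3D point and rebuilds the point from them. The function names `extended_affine_coords` and `from_extended_affine` are hypothetical, and the approach (projecting onto the normal for $\gamma$, then solving a 2x2 system for $\alpha$ and $\beta$) is one possible way to invert the equation, not a procedure taken from the text.

```python
def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def extended_affine_coords(p0, p1, p2, p3):
    """Return (alpha, beta, gamma) satisfying Equation 37 for p3."""
    a, b, d = _sub(p1, p0), _sub(p2, p0), _sub(p3, p0)
    n = _cross(a, b)                 # normal to the plane of p0, p1, p2
    nn = _dot(n, n)
    if nn == 0:
        raise ValueError("p0, p1, p2 are collinear")
    gamma = _dot(d, n) / nn          # n is orthogonal to both a and b
    # In-plane part: solve the 2x2 system d.a = alpha a.a + beta a.b, etc.
    aa, ab, bb = _dot(a, a), _dot(a, b), _dot(b, b)
    da, db = _dot(d, a), _dot(d, b)
    det = aa * bb - ab * ab
    alpha = (da * bb - db * ab) / det
    beta = (db * aa - da * ab) / det
    return alpha, beta, gamma

def from_extended_affine(p0, p1, p2, alpha, beta, gamma):
    """Evaluate the right-hand side of Equation 37."""
    a, b = _sub(p1, p0), _sub(p2, p0)
    n = _cross(a, b)
    return tuple(alpha * a[k] + beta * b[k] + gamma * n[k] + p0[k]
                 for k in range(3))
```

Because a proper rotation maps a cross product to the cross product of the rotated vectors, the coordinates computed this way do not depend on the frame in which the three reference points are expressed.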

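Let

$$ x_{01} = x_1 - x_0, \qquad y_{01} = y_1 - y_0, $$

$$ x_{02} = x_2 - x_0, \qquad y_{02} = y_2 - y_0. $$

Then, using the three model points with $\vec{p}_0 = \vec{m}_0$, $\vec{p}_1 = \vec{m}_1$, and $\vec{p}_2 = \vec{m}_2$,

$$ \vec{p}_1 - \vec{p}_0 = \frac{1}{s}(x_{01}, y_{01}, h_1) \tag{38} $$

$$ \vec{p}_2 - \vec{p}_0 = \frac{1}{s}(x_{02}, y_{02}, h_2) \tag{39} $$

$$ (\vec{p}_1 - \vec{p}_0) \times (\vec{p}_2 - \vec{p}_0) = \frac{1}{s^2}(y_{01}h_2 - y_{02}h_1,\; x_{02}h_1 - x_{01}h_2,\; x_{01}y_{02} - x_{02}y_{01}). \tag{40} $$

Next, substitute Equations 38-40 into Equation 37 to get the three-space location of the fourth
point:

$$ \begin{aligned}
\vec{m}_3 &= \frac{1}{s}\alpha(x_{01}, y_{01}, h_1) + \frac{1}{s}\beta(x_{02}, y_{02}, h_2) \\
&\quad + \gamma\frac{1}{s^2}(y_{01}h_2 - y_{02}h_1,\; -x_{01}h_2 + x_{02}h_1,\; x_{01}y_{02} - x_{02}y_{01}) + \frac{1}{s}(x_0, y_0, w) \\
&= \frac{1}{s}\Big(\alpha x_{01} + \beta x_{02} + \gamma\frac{y_{01}h_2 - y_{02}h_1}{s} + x_0,\;
\alpha y_{01} + \beta y_{02} + \gamma\frac{-x_{01}h_2 + x_{02}h_1}{s} + y_0, \\
&\qquad\quad \alpha h_1 + \beta h_2 + \gamma\frac{x_{01}y_{02} - x_{02}y_{01}}{s} + w\Big)
\end{aligned} \tag{41} $$

To project, first apply the scale factor $s$:

$$ \begin{aligned}
s\vec{m}_3 = \Big(&\alpha x_{01} + \beta x_{02} + \gamma\frac{y_{01}h_2 - y_{02}h_1}{s} + x_0,\;
\alpha y_{01} + \beta y_{02} + \gamma\frac{-x_{01}h_2 + x_{02}h_1}{s} + y_0, \\
&\alpha h_1 + \beta h_2 + \gamma\frac{x_{01}y_{02} - x_{02}y_{01}}{s} + w\Big)
\end{aligned} \tag{42} $$

Let $\Pi$ represent an orthogonal projection along the $z$ axis. Then, since $H_1 = h_1/s$ and $H_2 = h_2/s$,
project orthographically to get the image location of the fourth point:

$$ \Pi(s\vec{m}_3) = (\alpha x_{01} + \beta x_{02} + \gamma(y_{01}H_2 - y_{02}H_1) + x_0,\;
\alpha y_{01} + \beta y_{02} + \gamma(-x_{01}H_2 + x_{02}H_1) + y_0) \tag{43} $$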


Notice that the unknown offset $w$ has dropped out. This expression computes the image position
of $\vec{p}_3$ from its extended affine coordinates, from the image points, and from $H_1$ and $H_2$, the
altitudes in the weak-perspective geometry. No intermediate results about the actual
3D pose are stored along the way, and as a result, this computation should be very efficient.
Nonetheless, it should be kept in mind that $H_1$ and $H_2$ depend on the specific imaging geometry;
that is, they depend on the pose of the model.
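
The expression for the image position of the fourth point can be written directly as a short function. This is only a sketch: the name `project_fourth_point` and its argument order are assumptions made for illustration, and the altitudes `H1` and `H2` are taken as already computed from the pose solution.

```python
def project_fourth_point(i0, i1, i2, alpha, beta, gamma, H1, H2):
    """Image position of a fourth model point (Equation 43).

    i0, i1, i2 are the matched image points as (x, y) pairs; alpha, beta,
    gamma are the extended affine coordinates of the fourth model point;
    H1, H2 are the altitudes from the weak-perspective solution.
    """
    x0, y0 = i0
    x01, y01 = i1[0] - x0, i1[1] - y0
    x02, y02 = i2[0] - x0, i2[1] - y0
    x = alpha * x01 + beta * x02 + gamma * (y01 * H2 - y02 * H1) + x0
    y = alpha * y01 + beta * y02 + gamma * (-x01 * H2 + x02 * H1) + y0
    return x, y
```

For example, with image points (7, 1), (13, 1), (9, 9), coordinates $(\alpha, \beta, \gamma) = (0.3, 0.4, 0.1)$, and altitudes $H_1 = 2$, $H_2 = 1$, the function returns a point at approximately (8, 4). When $\gamma = 0$ the altitudes have no influence and the result is the ordinary affine combination of the three image points.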

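It may be worthwhile to observe that Equation 43, the expression for the fourth point, can
be rewritten as a weighted sum of the three image points:

$$ \begin{aligned}
\Pi(s\vec{m}_3) &= (\alpha x_{01} + \beta x_{02} + \gamma(y_{01}H_2 - y_{02}H_1) + x_0,\;
\alpha y_{01} + \beta y_{02} + \gamma(-x_{01}H_2 + x_{02}H_1) + y_0) \\
&= (\alpha x_1 + \gamma H_2 y_1,\; \alpha y_1 - \gamma H_2 x_1) - (\alpha x_0 + \gamma H_2 y_0,\; \alpha y_0 - \gamma H_2 x_0) \\
&\quad + (\beta x_2 - \gamma H_1 y_2,\; \beta y_2 + \gamma H_1 x_2) - (\beta x_0 - \gamma H_1 y_0,\; \beta y_0 + \gamma H_1 x_0) + (x_0, y_0) \\
&= \begin{bmatrix} 1 - \alpha - \beta & \gamma(H_1 - H_2) \\ -\gamma(H_1 - H_2) & 1 - \alpha - \beta \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
+ \begin{bmatrix} \alpha & \gamma H_2 \\ -\gamma H_2 & \alpha \end{bmatrix}
\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}
+ \begin{bmatrix} \beta & -\gamma H_1 \\ \gamma H_1 & \beta \end{bmatrix}
\begin{bmatrix} x_2 \\ y_2 \end{bmatrix}
\end{aligned} $$

Let $R_\theta$ represent a 2D rotation matrix that rotates by an angle $\theta$. Then

$$ \Pi(s\vec{m}_3) = \delta_0 R_{\theta_0}\vec{i}_0 + \delta_1 R_{\theta_1}\vec{i}_1 + \delta_2 R_{\theta_2}\vec{i}_2, \tag{44} $$

where

$$ \delta_0 = \sqrt{(1 - \alpha - \beta)^2 + (\gamma(H_1 - H_2))^2} \tag{45} $$

$$ \delta_1 = \sqrt{\alpha^2 + (\gamma H_2)^2} \tag{46} $$

$$ \delta_2 = \sqrt{\beta^2 + (\gamma H_1)^2} \tag{47} $$

$$ \begin{aligned}
\cos\theta_0 &= \frac{1 - \alpha - \beta}{\delta_0}, & \sin\theta_0 &= \frac{-\gamma(H_1 - H_2)}{\delta_0}, \\
\cos\theta_1 &= \frac{\alpha}{\delta_1}, & \sin\theta_1 &= \frac{-\gamma H_2}{\delta_1}, \\
\cos\theta_2 &= \frac{\beta}{\delta_2}, & \sin\theta_2 &= \frac{\gamma H_1}{\delta_2}.
\end{aligned} \tag{48} $$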

Thus, we can view the computation as a 2D rotation and scale of each image point separately
followed by a sum of the three. It is important to keep in mind, however, that the rotations
and scales themselves depend on the image points, because of $H_1$ and $H_2$.
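
The rotate-scale-and-sum view can be checked numerically with a small sketch. The function names below are hypothetical, and the rotation convention assumed is the usual counterclockwise one, $R_\theta = \begin{bmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}$, which is what the three 2x2 matrices preceding Equation 44 correspond to.

```python
import math

def scales_and_angles(alpha, beta, gamma, H1, H2):
    """Return [(delta_k, theta_k)] for k = 0, 1, 2 (Equations 45-48)."""
    # Each pair is (delta*cos(theta), delta*sin(theta)), read off the
    # 2x2 matrices that multiply i0, i1, and i2.
    pairs = [
        (1.0 - alpha - beta, -gamma * (H1 - H2)),
        (alpha, -gamma * H2),
        (beta, gamma * H1),
    ]
    return [(math.hypot(c, s), math.atan2(s, c)) for c, s in pairs]

def fourth_point_weighted(i0, i1, i2, alpha, beta, gamma, H1, H2):
    """Evaluate Equation 44: a sum of rotated and scaled image points."""
    x = y = 0.0
    params = scales_and_angles(alpha, beta, gamma, H1, H2)
    for (delta, theta), (px, py) in zip(params, (i0, i1, i2)):
        c, s = math.cos(theta), math.sin(theta)
        x += delta * (c * px - s * py)
        y += delta * (s * px + c * py)
    return x, y
```

With the same inputs as a direct evaluation of Equation 43, this weighted-sum form should give the same image point up to rounding.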

When the model is planar, the form of Equation 44 facilitates understanding the effects
of error in the image points. Error in the locations of the matched image points leads to
uncertainty in the image location of the fourth model point. Suppose that the true locations of
the matched image points are known to be within a few, say $\epsilon_i$, pixels of their nominal locations,
for $i = 0, 1, 2$. Let $\vec{i}_i$ and $\vec{c}_i$ be the true and nominal locations of an image point, for $i = 0, 1, 2$.
Then, for some $\vec{e}_0$, $\vec{i}_0 = \vec{c}_0 + \vec{e}_0$, where $\|\vec{e}_0\| = \epsilon_0$, and similarly for $\vec{i}_1$ and $\vec{i}_2$. Then

$$ \begin{aligned}
\Pi(s\vec{m}_3) &= \delta_0 R_{\theta_0}\vec{i}_0 + \delta_1 R_{\theta_1}\vec{i}_1 + \delta_2 R_{\theta_2}\vec{i}_2 \\
&= (\delta_0 R_{\theta_0}\vec{c}_0 + \delta_1 R_{\theta_1}\vec{c}_1 + \delta_2 R_{\theta_2}\vec{c}_2)
+ (\delta_0 R_{\theta_0}\vec{e}_0 + \delta_1 R_{\theta_1}\vec{e}_1 + \delta_2 R_{\theta_2}\vec{e}_2)
\end{aligned} $$

When the fourth point is in the plane of the first three, $\gamma = 0$, so that the scales, $\delta_0$, $\delta_1$, and $\delta_2$,
and 2D rotations, $R_{\theta_0}$, $R_{\theta_1}$, and $R_{\theta_2}$, are all constant (see Equations 45-48). This means that
the first term in parentheses is just the nominal image location of the fourth model point. Since
$\vec{e}_0$, $\vec{e}_1$, and $\vec{e}_2$ move around circles, the 2D rotations in the second term can be ignored. Further,
since these error vectors move independently around their error circles, their radii simply sum
together. Therefore, the region of possible locations of the fourth model point is bounded by a
circle of radius $\delta_0\epsilon_0 + \delta_1\epsilon_1 + \delta_2\epsilon_2$ that is centered at the nominal point. By plugging $\gamma = 0$ into
Equations 45-47, we get that

$$ \delta_0 = |1 - \alpha - \beta|, \qquad \delta_1 = |\alpha|, \qquad \delta_2 = |\beta|. $$

Assuming $\epsilon_0 = \epsilon_1 = \epsilon_2 = \epsilon$, this implies that the uncertainty in the image location of a fourth
point is bounded by a circle with radius $(|1 - \alpha - \beta| + |\alpha| + |\beta|)\epsilon$ and with its center at the
nominal point, which repeats the result given earlier by Jacobs [19].
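
As a small numerical illustration of this bound, the sketch below evaluates the radius for a planar fourth point. The function names are hypothetical; the formulas are the ones stated above for $\gamma = 0$.

```python
def planar_error_radius(alpha, beta, eps):
    """Radius of the uncertainty circle when all three image points
    share the same error bound eps (planar case, gamma = 0)."""
    return (abs(1.0 - alpha - beta) + abs(alpha) + abs(beta)) * eps

def planar_error_radius_general(alpha, beta, eps0, eps1, eps2):
    """Same bound with a separate error radius for each image point."""
    return abs(1.0 - alpha - beta) * eps0 + abs(alpha) * eps1 + abs(beta) * eps2
```

For a fourth point inside the model triangle, such as $(\alpha, \beta) = (0.5, 0.5)$, the radius equals $\epsilon$. For a point well outside it, such as $(\alpha, \beta) = (2, -1)$, the radius grows to $3\epsilon$, which shows how extrapolating away from the matched points magnifies the image error.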

Although the non-planar case clearly is more complicated, since the scales and 2D rotations 
are no longer constant, Equation 44 may prove useful for obtaining bounds on the effects of 
error in this situation as well. 


6 Stability of the 3D Pose Solution 

In numerical computations, it is well-advised to determine whether a computation is stable, 
since, if not, it could produce inaccurate results. A computation is unstable if any roundoff 
error can propagate and magnify such that the true answer is significantly altered. The most 
common source of roundoff error is known as catastrophic cancellation, where two numbers of 
nearly equal magnitudes and opposite signs are summed. In fact, catastrophic cancellation is 
the only way a sudden loss of precision can occur [31]. Otherwise, in general precision can be 
lost by an accumulation of small errors over several operations.

In the 3D pose solution, there are a few subtractions of positive numbers to be wary of. In
computing $h_1$ and $h_2$ from $s$ (Equation 7), the values of $h_1$ and $h_2$ may have little precision if
cancellation occurs in the radicands, in which case $h_1$ or $h_2$ will be small relative to its range
of values. As discussed at the end of Section 4.1, $h_1$ or $h_2$ is zero when one of the sides of the
model triangle that emanates from $\vec{m}_0$ lies parallel to the image plane.

The calculation of $h_1$ and $h_2$ can also be unstable if $s$ is inaccurate. Looking at Equation 6
and recalling that $a$, $b$, and $c$ are non-negative, catastrophic cancellation can only occur in the
inner radicand. Even if it does, this is not a problem, since the result of the square root would
be negligible when added to $b$.

Another way for s to become inaccurate is if the value of a, b, or c in Equation 6 is obtained 
with little precision. For a and c, Equations 9 and 11 show in parentheses one of the sides of 
a triangle being subtracted from the sum of the other two; therefore, catastrophic cancellation 
may occur when the triangle is nearly a line. Equation 10 shows that cancellation may occur 
in computing b if either the terms in parentheses or the total sum approaches zero relative to 
their ranges of values. From the law of cosines, the terms in parentheses are near zero when 
some angle of the model triangle is small. From Equation 25, the total sum, i.e., b, is small 
only if certain angles in the model and image triangles are small also. This says we should be 
careful of b in the same circumstances in which we are careful of a and c, namely, when the 
model or image points are nearly collinear. 
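
The cancellation in a factor of the form "sum of two sides minus the third" can be seen in a small experiment. The sketch below is only an illustration with made-up coordinates: it evaluates such a factor for a nearly collinear triangle once in double precision and once with 50-digit decimal arithmetic. The helper name `side_excess` and the sample points are assumptions, not values from the paper.

```python
import math
from decimal import Decimal, getcontext

def side_excess(points, num, sqrt):
    """Return |p0p1| + |p1p2| - |p0p2| using the given number type."""
    p0, p1, p2 = [(num(x), num(y)) for x, y in points]

    def dist(p, q):
        return sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

    return dist(p0, p1) + dist(p1, p2) - dist(p0, p2)

# A nearly degenerate triangle: the third point is 1e-9 off the line.
points = [("0", "0"), ("1", "0"), ("2", "1e-9")]

getcontext().prec = 50
in_double = side_excess(points, float, math.sqrt)
in_decimal = side_excess(points, Decimal, lambda d: d.sqrt())
```

In double precision the three side lengths round to 1, 1, and 2, so the computed factor is exactly zero, whereas the higher-precision value is small but positive. All significant digits of the factor are lost, which is the situation described above for $a$ and $c$ when the triangle is nearly a line.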

To conclude, the parameters $s$, $h_1$, and $h_2$ (or $s$, $H_1$, and $H_2$) are prone to instability when
the matched model or image points are almost collinear, and, additionally, $H_1$ or $H_2$ can be
unstable when one of the vectors from $\vec{m}_0$ to $\vec{m}_1$ or $\vec{m}_2$ is nearly parallel to the image. In the
latter case, the unstable $H_1$ or $H_2$ is close to zero. If only one of $H_1$ and $H_2$ is close to zero,
then the instability can be avoided by re-ordering the matched points to make both $H_1$ and
$H_2$ large. However, if this is done, the difference $H_1 - H_2$ will be close to zero and may be
imprecise. If both $H_1$ and $H_2$ are almost zero, which means the model triangle is nearly parallel
to the image, then re-ordering the matched points will not help.

Finally, it is worth observing that much of the instability in the pose solution occurs at
places in which the problem is ill conditioned, that is, places where instability is inherent in
the geometry. For instance, $H_1$ was said to be unstable when the vector from $\vec{m}_0$ to $\vec{m}_1$ is
nearly parallel to the image. Geometrically, in this situation a small change in the position of
$\vec{i}_1$ can cause a large change in the altitude $H_1$ (Fig. 2). For the same reason, recovering the
altitude $H_2$ is unstable when the vector from $\vec{m}_0$ to $\vec{m}_2$ is nearly parallel to the image. This
situation would be worse if both vectors emanating from $\vec{m}_0$ were parallel to the image. By
a similar argument, it is intrinsically unstable to recover the pose when the model points are
nearly collinear, due to there being an infinity of solutions when the model points are exactly
collinear (Section 4.5).

This suggests that recognition systems like alignment and pose clustering should give special
attention to situations where the model triangle is almost a line and where the model triangle
is being viewed straight on. These cases could be avoided by checking if the model points are
nearly collinear or if the corresponding angles between the model and image points are very
close. For the latter case, the suggestion does not apply if alignment is being used to recognize
planar models. This is because, if Equation 14 is used, error in $H_1$ or $H_2$ has no effect on the
image locations of points in the plane, since for these points $\gamma = 0$.


7 Review of Previous Solutions 

There have been several earlier solutions to the weak-perspective three-point problem, notably
by Kanade and Kender [20], Cyganski and Orr ([7], [8]), Ullman ([28], [17]), Huttenlocher
and Ullman ([16], [18], [29]), and Grimson, Huttenlocher, and Alter [12]. All the previous
solutions compute the 3D pose by going through a 3D rigid transformation or a 2D affine
transformation relating the model to the image. A 2D affine transform is a linear transform
plus a translation, and it can be applied to any object lying in the plane. All but Ullman’s
and Grimson, Huttenlocher, and Alter’s solutions compute an affine transformation between
the three model and image points. Also, all but Kanade and Kender’s solution compute a
model-to-image rigid transformation, either via a rotation matrix or via Euler angles.

Not all of the solutions directly solve the weak-perspective three-point problem. The earliest
solution, which was given by Kanade and Kender in 1983, applies Kanade’s skewed-symmetry
constraint to recover the 3D orientation of a symmetric, planar pattern [20]. More precisely,
Kanade and Kender showed how to compute the 3D orientation of the plane containing a
symmetric, planar pattern from a 2D affine transform between an image of the pattern and
the pattern itself. To apply this result to the weak-perspective three-point problem, the three
points can be used to construct a symmetric, planar pattern, and a 2D affine transform can be
computed from two sets of three corresponding points. The solution was shown to exist and to
give two solutions related by a reflective ambiguity, assuming that the determinant of the affine
transform is positive.

The remaining methods all concentrate on computing the 3D rigid transform from the model
to the image. In 1985, while presenting a system for recognizing planar objects, Cyganski and
Orr showed how to use higher-order moments to compute a 2D affine transform between planar
regions ([7], [8]). Given the affine transform, they listed expressions for computing the 3D
Euler angles from the 2D affine transform.$^1$ They did not, however, discuss how they derived
the expressions.

The next method is the solution given by Ullman in 1986 [28], which appeared again in [17].
The paper included a proof that the solution for the scale factor is unique and the solution for
the rotation matrix is unique up to an inherent two-way ambiguity. (This corresponds to the
ambiguity in $H_1$ and $H_2$.) But Ullman did not show that the solution exists. For the case when
it does exist, Ullman described a method for obtaining the rotation matrix and scale factor.

In 1988, Huttenlocher and Ullman gave another solution, and, in the process, gave the
first complete proof that the solution both exists and is unique (up to the two-way ambiguity)
([16], [18], [29]). Like Kanade and Kender, and Cyganski and Orr, Huttenlocher and Ullman’s
solution relies on a 2D affine transform. The solution itself is based on algebraic constraints
derived from rigidity, which are used to recover the elements of the scaled rotation matrix.

$^1$ The expressions that appear in [7] contain typesetting errors, but are listed correctly in [8].

The last solution, which was published this year, was developed by Grimson, Huttenlocher,
and Alter for the purpose of analyzing the effects of image noise on error in transformation
space [12]. Towards this end, the method facilitates computing how a small perturbation in
each transformation parameter propagates to uncertainty ranges in the other parameters.

8 Presentation of Three Previous Solutions 

The solutions discussed in the previous section differ significantly in how they compute the 
transformation, and, as a result, each one can provide different insights into solving related 
problems, such as error analysis in alignment-based recognition and pose clustering. It seems 
useful, then, to present the previous solutions in detail, so they conveniently can be referred to 
and compared. 

The first method presented is Ullman’s solution, which the first part of this paper extended.
After that, I give Huttenlocher and Ullman’s solution. Lastly, I present the method of Grimson,
Huttenlocher, and Alter. I do not present Kanade and Kender’s method nor Cyganski and Orr’s,
because Kanade and Kender did not directly solve the weak-perspective three-point problem,
and Cyganski and Orr did not detail their solution.

It should be pointed out that the presentations here differ somewhat from the ones given 
by the original authors, but the ideas are the same. Basically, the presentations emphasize the 
steps that recover the 3D pose while being complete and concise. For more details, the reader 
is referred to the original versions in the references. 

In the following presentations, we are looking for a rigid transform plus scale that aligns 
the model points to the image points. In all methods, we are free to move rigidly the three 
image points or the three model points wherever we wish, since this amounts to tacking on an 
additional transform before or after the aligning one. For example, this justifies the assumption 
made below that the plane of the model points is parallel to the image plane. 

For consistency, the same notation as in Sections 3 and 4 is used in the proofs that follow:
Let the model points be $\vec{m}_0$, $\vec{m}_1$, $\vec{m}_2$ and the image points be $\vec{i}_0$, $\vec{i}_1$, $\vec{i}_2$, with the respective
distances between the points being $R_{01}$, $R_{02}$, and $R_{12}$ for the model points, and $d_{01}$, $d_{02}$, and
$d_{12}$ for the image points.

8.1 Overview 

This section provides an overview of the three methods. 

Initially, all three methods compute a transformation that brings the model into image
coordinates, such that the plane of the three matched model points is parallel to the image
plane and such that $\vec{m}_0$ projects onto $\vec{i}_0$, which has been translated to the origin. The three
methods then compute the out-of-plane rotation and scale that align the matched model and
image points. In so doing, the methods all end up solving a biquadratic equation.

In Ullman’s method, the model and image points are further transformed via rotations
around the $z$ axis to align $\vec{m}_1$ and $\vec{i}_1$ along the $x$ axis. Then the 3D rotation matrix for rotating
successively around the $x$ and $y$ axes is expressed in terms of Euler angles. This leads to a
series of three equations in three unknowns, which are solved to get a biquadratic in the scale
factor. To get the elements of the rotation matrix, the solution for scale factor is substituted
back into the original three equations.

Instead of further rotating the model and image points, Huttenlocher and Ullman compute 
an affine transform between them, which immediately gives the top-left sub-matrix of the scaled 
rotation matrix. Then by studying what happens to two equal-length vectors in the plane, a 
biquadratic is obtained. The scale factor and the remaining elements of the scaled rotation 
matrix are found using the algebraic constraints on the columns of a scaled rotation matrix. 

Like Ullman, Grimson, Huttenlocher, and Alter rotate the model further to align $\vec{m}_1$
and $\vec{i}_1$. The desired out-of-plane rotation is expressed in terms of two angles that give the
rotation about two perpendicular axes in the plane. Next, Rodrigues’ formula, which computes
the 3D rotation of a point about some axis, is used to eliminate the scale factor and obtain two
constraints on the two rotation angles. The two constraints are solved to get a biquadratic in
the cosine of one of the angles. Its solution is substituted back to get the other angle and the
scale factor, which can be used directly by Rodrigues’ formula to transform any other model
point.

As mentioned in the introduction, Ullman’s solution is incomplete because it does not show
which of the two solutions for the scale factor is correct; actually, the solution is completed by the
result given in Section 4.1 of this paper. Grimson, Huttenlocher, and Alter’s solution has the
same drawback as Ullman’s method: it does not show which solution to its biquadratic is
correct. Huttenlocher and Ullman, on the other hand, have no such problem because it turns
out that one of the two solutions to their biquadratic is obviously not real, and so it immediately
is discarded.

8.2 Ullman’s method 

This section gives Ullman’s solution to the weak-perspective three-point problem. The main 
idea is first to transform the three model points to the image plane and then solve for the scale 
and out-of-plane rotation that align the transformed points. 

Specifically, the model points first are rigidly transformed to put the three model points in
the image plane with $\vec{m}_0$ at the origin of the image coordinate system and $\vec{m}_1 - \vec{m}_0$ aligned with
the $x$ axis. After rigidly transforming the model points, the resulting points can be represented
by $(0, 0, 0)$, $(\bar{x}_1, 0, 0)$, and $(\bar{x}_2, \bar{y}_2, 0)$. Similarly, let the image points be rigidly transformed to
put $\vec{i}_0$ at the origin and $\vec{i}_1 - \vec{i}_0$ along the $x$ axis, and let the resulting image points be $(0, 0, 0)$,
$(x_1, 0, 0)$, and $(x_2, y_2, 0)$.

Next, we break the out-of-plane rotation into a rotation around the $x$ axis by an angle $\theta$
followed by a rotation around the $y$ axis by an angle $\phi$, as pictured in Fig. 7. The corresponding
rotation matrix is

$$ R = \begin{bmatrix} \cos\phi & 0 & \sin\phi \\ 0 & 1 & 0 \\ -\sin\phi & 0 & \cos\phi \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}
= \begin{bmatrix} \cos\phi & \sin\phi\sin\theta & \sin\phi\cos\theta \\ 0 & \cos\theta & -\sin\theta \\ -\sin\phi & \cos\phi\sin\theta & \cos\phi\cos\theta \end{bmatrix} \tag{49} $$


After rotation and scale, $(0, 0, 0)$, $(\bar{x}_1, 0, 0)$, and $(\bar{x}_2, \bar{y}_2, 0)$ become $(0, 0, 0)$, $(x_1, 0, z_1)$, and
$(x_2, y_2, z_2)$, respectively, where $z_1$ and $z_2$ are unknown. Thus, we need to find $\theta$, $\phi$, and $s$
such that

$$ sR(\bar{x}_1, 0, 0) = (x_1, 0, z_1) $$

$$ sR(\bar{x}_2, \bar{y}_2, 0) = (x_2, y_2, z_2) $$

Expanding the first two rows of $R$ yields three equations in three unknowns:

$$ s\bar{x}_1\cos\phi = x_1 \tag{50} $$

$$ s\bar{y}_2\cos\theta = y_2 \tag{51} $$

$$ s\bar{x}_2\cos\phi + s\bar{y}_2\sin\phi\sin\theta = x_2 \tag{52} $$

Fig. 7 gives a graphical interpretation of the first two equations. Substituting Equations 50
and 51 along with expressions for $\sin\phi$ and $\sin\theta$ into Equation 52 yields a biquadratic in the
scale factor $s$:

$$ as^4 - bs^2 + c = 0, \tag{53} $$

where

$$ a = \bar{x}_1^2\bar{y}_2^2 \tag{54} $$

$$ b = x_1^2(\bar{x}_2^2 + \bar{y}_2^2) + \bar{x}_1^2(x_2^2 + y_2^2) - 2x_1x_2\bar{x}_1\bar{x}_2 \tag{55} $$

$$ c = x_1^2y_2^2 \tag{56} $$

The positive solutions for $s$ are given by

$$ s = \sqrt{\frac{b \pm \sqrt{b^2 - 4ac}}{2a}} \tag{57} $$


In general there can be one, two, or no solutions for s. Ullman makes no further attempt to 
determine when or if each solution arises, except to refer to a uniqueness proof he gives earlier 
in the paper. The uniqueness proof implies there can be at most one solution for s, but does 
not say which solution it is or whether it can be either one at different times. 

Given $s$, the rotation matrix $R$ is obtained using $\cos\phi = \frac{x_1}{s\bar{x}_1}$ and $\cos\theta = \frac{y_2}{s\bar{y}_2}$ in Equation 49.
One difficulty with this is that we do not know the signs of $\sin\theta$ and $\sin\phi$; this leaves four
possibilities for the pair $(\sin\theta, \sin\phi)$. In his uniqueness proof, Ullman points out that the
inherent reflective ambiguity corresponds to multiplying simultaneously the elements $r_{13}$, $r_{23}$,
$r_{31}$, and $r_{32}$ of $R$ by $-1$. In Equation 49, the signs of those elements also are inverted when
both $\sin\theta$ and $\sin\phi$ are multiplied by $-1$, which, visually, corresponds to reflecting the model
points about the image plane (Fig. 7). Still, we have no way to know which of the two pairs of
solutions is correct. One way to proceed is to try both and see which solution pair aligns the
points.
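
The algebra of Equations 53-57 can be sketched in a few lines. The function names `ullman_scales` and `ullman_cosines` and the argument order are hypothetical, and the inputs are assumed to be already transformed as described above (model points $(\bar{x}_1, 0, 0)$ and $(\bar{x}_2, \bar{y}_2, 0)$, image points $(x_1, 0)$ and $(x_2, y_2)$). The sketch returns both candidate scales and leaves the choice between them open.

```python
import math

def ullman_scales(xb1, xb2, yb2, x1, x2, y2):
    """Positive roots s of a*s^4 - b*s^2 + c = 0 (Equations 53-57)."""
    a = xb1 ** 2 * yb2 ** 2
    b = (x1 ** 2 * (xb2 ** 2 + yb2 ** 2)
         + xb1 ** 2 * (x2 ** 2 + y2 ** 2)
         - 2.0 * x1 * x2 * xb1 * xb2)
    c = x1 ** 2 * y2 ** 2
    if a == 0:
        raise ValueError("model points are collinear (a = 0)")
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []
    roots = []
    for sign in (1.0, -1.0):
        s_squared = (b + sign * math.sqrt(disc)) / (2.0 * a)
        if s_squared > 0:
            roots.append(math.sqrt(s_squared))
    return roots

def ullman_cosines(s, xb1, yb2, x1, y2):
    """cos(phi) and cos(theta) from Equations 50 and 51 for a given s."""
    return x1 / (s * xb1), y2 / (s * yb2)
```

A candidate scale for which either returned cosine exceeds one in magnitude cannot correspond to a rotation, which gives a simple numerical check to apply to each root.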


8.3 Huttenlocher and Ullman’s method 


First, assume the plane containing the model points is parallel to the image plane. Then
subtract out $\vec{m}_0$ and $\vec{i}_0$ from the model and image points, respectively, to align them at the
origin. Let the resulting model points be $(0, 0, 0)$, $(\bar{x}_1, \bar{y}_1, 0)$, and $(\bar{x}_2, \bar{y}_2, 0)$, and the resulting
image points be $(0, 0)$, $(x_1, y_1)$, and $(x_2, y_2)$. At this point, what is left is to compute the scaled
rotation matrix that brings $(\bar{x}_1, \bar{y}_1, 0)$ and $(\bar{x}_2, \bar{y}_2, 0)$ to $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, respectively,
where $z_1$ and $z_2$ are unknown. That is, we need

$$ sR(\bar{x}_1, \bar{y}_1, 0) = (x_1, y_1, z_1) $$

$$ sR(\bar{x}_2, \bar{y}_2, 0) = (x_2, y_2, z_2). $$


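Letting $l_{11} = sr_{11}$, $l_{12} = sr_{12}$, etc., and focusing on the first two rows of the rotation matrix,
we get two sets of equations:

$$ l_{11}\bar{x}_1 + l_{12}\bar{y}_1 = x_1 \tag{58} $$

$$ l_{11}\bar{x}_2 + l_{12}\bar{y}_2 = x_2 \tag{59} $$

$$ l_{21}\bar{x}_1 + l_{22}\bar{y}_1 = y_1 \tag{60} $$

$$ l_{21}\bar{x}_2 + l_{22}\bar{y}_2 = y_2, \tag{61} $$

which give $\begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix}$, the top-left sub-matrix of the scaled rotation matrix. Note that this
step fails if the determinant, $\bar{x}_1\bar{y}_2 - \bar{x}_2\bar{y}_1$, equals zero.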

Next, we make a digression to consider what happens to two orthogonal, equal-length vectors
in the plane, $\vec{e}_1$ and $\vec{e}_2$. Since $\vec{e}_1$ and $\vec{e}_2$ are in the plane, we can apply the sub-matrix just
computed to obtain the resulting vectors, $\vec{e}_1'$ and $\vec{e}_2'$:

$$ \vec{e}_1' = \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix}\vec{e}_1, \qquad
\vec{e}_2' = \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix}\vec{e}_2 \tag{62} $$

Figure 8: Projecting two orthogonal same-length vectors in Huttenlocher and Ullman’s
method.

When a model is transformed, $\vec{e}_1$ and $\vec{e}_2$ undergo a rigid transformation plus scale before
projection. As shown in Fig. 8, after transformation these vectors become $\vec{e}_1' + c_1\hat{z}$ and $\vec{e}_2' + c_2\hat{z}$.
Since a scaled, rigid transform preserves angles and ratios of lengths between vectors, and since
$\vec{e}_1 \cdot \vec{e}_2 = 0$ and $\|\vec{e}_1\| = \|\vec{e}_2\|$, it must be that

$$ (\vec{e}_1' + c_1\hat{z}) \cdot (\vec{e}_2' + c_2\hat{z}) = 0 $$

$$ \|\vec{e}_1'\|^2 + c_1^2 = \|\vec{e}_2'\|^2 + c_2^2. $$

These two equations simplify to

$$ c_1c_2 = k_1 $$

$$ c_1^2 - c_2^2 = k_2 $$

where

$$ k_1 = -\vec{e}_1' \cdot \vec{e}_2' $$

$$ k_2 = \|\vec{e}_2'\|^2 - \|\vec{e}_1'\|^2 $$

Substituting for $c_2 = \frac{k_1}{c_1}$ in the second equation leads to a biquadratic in $c_1$:

$$ c_1^4 - k_2c_1^2 - k_1^2 = 0 $$

The general solution is

$$ c_1 = \pm\sqrt{\frac{1}{2}\left(k_2 \pm \sqrt{k_2^2 + 4k_1^2}\right)} $$

Conveniently, the inner discriminant always is greater than or equal to zero. Furthermore, since
$4k_1^2 \ge 0$, the real solutions are given by

$$ c_1 = \pm\sqrt{\frac{1}{2}\left(k_2 + \sqrt{k_2^2 + 4k_1^2}\right)} $$

since otherwise the outer discriminant is less than zero.

These two solutions for $c_1$ give two corresponding solutions for $c_2$, which from Fig. 8 can be
seen to correspond to a reflection about the image plane.

The solution for $c_2$ does not work when $c_1 = 0$. In this case,

$$ c_2 = \pm\sqrt{-k_2} = \pm\sqrt{\|\vec{e}_1'\|^2 - \|\vec{e}_2'\|^2}. \tag{67} $$

This gives two solutions for $c_2$, if it exists, which can be seen as follows. Since $c_1 = 0$, $\vec{e}_1$ ends
up in the plane, so that the length of $\vec{e}_1$ is just scaled down by $s$, whereas the length of
$\vec{e}_2$ reduces both by being scaled down and by projection. Consequently, $\|\vec{e}_2'\| \le \|\vec{e}_1'\|$, and,
therefore, $c_2$ exists.

Given $c_1$ and $c_2$, we can recover two more elements of the scaled rotation matrix. Since $\vec{e}_1$
and $\vec{e}_2$ are in the plane, we know that $sR\vec{e}_1 = \vec{e}_1' + c_1\hat{z}$ and $sR\vec{e}_2 = \vec{e}_2' + c_2\hat{z}$. Focusing on the
last row of the scaled rotation matrix, we get the two equations $l_{31} = c_1$ and $l_{32} = c_2$.

At this point, we have the first two columns of $sR$, and, from the constraints on the columns
of a rotation matrix, we can get the last column from the cross product of the first two. In
total, this gives

$$ sR = \begin{bmatrix}
l_{11} & l_{12} & \frac{1}{s}(c_2l_{21} - c_1l_{22}) \\
l_{21} & l_{22} & \frac{1}{s}(c_1l_{12} - c_2l_{11}) \\
c_1 & c_2 & \frac{1}{s}(l_{11}l_{22} - l_{12}l_{21})
\end{bmatrix} \tag{68} $$

Since the columns of a rotation matrix have unit length, we know

$$ s = \sqrt{l_{11}^2 + l_{21}^2 + c_1^2} = \sqrt{l_{12}^2 + l_{22}^2 + c_2^2}. \tag{69} $$

Notice that the ambiguity in $c_1$ and $c_2$ inverts the signs of the appropriate elements of the
rotation matrix as discussed in Section 8.2.
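
The steps of this method can be collected into a short sketch. It is an illustration under stated assumptions rather than the authors' implementation: the function name `huttenlocher_ullman` is hypothetical, the two in-plane vectors are taken to be the unit vectors along $x$ and $y$ (so that $\vec{e}_1'$ and $\vec{e}_2'$ are the columns of the 2x2 sub-matrix), and only the non-negative root for $c_1$ is returned, so the reflected solution is obtained by negating $c_1$ and $c_2$.

```python
import math

def huttenlocher_ullman(m1, m2, i1, i2):
    """Return (s, sR) from two model points (x, y) and two image points.

    The model plane is assumed parallel to the image plane, with m0 and
    i0 already subtracted out.
    """
    (xb1, yb1), (xb2, yb2) = m1, m2
    (x1, y1), (x2, y2) = i1, i2
    det = xb1 * yb2 - xb2 * yb1
    if det == 0:
        raise ValueError("model points are collinear with the origin")
    # Solve Equations 58-61 by Cramer's rule.
    l11 = (x1 * yb2 - x2 * yb1) / det
    l12 = (xb1 * x2 - xb2 * x1) / det
    l21 = (y1 * yb2 - y2 * yb1) / det
    l22 = (xb1 * y2 - xb2 * y1) / det
    # Images of e1 = (1, 0) and e2 = (0, 1) are the columns of the sub-matrix.
    k1 = -(l11 * l12 + l21 * l22)
    k2 = (l12 ** 2 + l22 ** 2) - (l11 ** 2 + l21 ** 2)
    c1 = math.sqrt(0.5 * (k2 + math.sqrt(k2 * k2 + 4.0 * k1 * k1)))
    if c1 != 0:
        c2 = k1 / c1
    else:
        c2 = math.sqrt(max(-k2, 0.0))      # Equation 67
    s = math.sqrt(l11 ** 2 + l21 ** 2 + c1 ** 2)
    sR = [
        [l11, l12, (c2 * l21 - c1 * l22) / s],
        [l21, l22, (c1 * l12 - c2 * l11) / s],
        [c1, c2, (l11 * l22 - l12 * l21) / s],
    ]
    return s, sR
```

Dividing by a very small `c1` is numerically delicate, so a practical version would probably switch to the $c_1 = 0$ branch below some tolerance; the exact comparison is kept here only to mirror the text.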

8.4 Grimson, Huttenlocher, and Alter’s method 

Grimson et al. gave another solution to the weak-perspective three-point problem in order to
get a handle on how small perturbations affect the individual transformation parameters.

To start, assume the plane containing the model points is parallel to the image plane. Next,
rigidly transform the model points so that $\vec{m}_0$ projects to $\vec{i}_0$ and $\vec{m}_1 - \vec{m}_0$ projects along $\vec{i}_1 - \vec{i}_0$.
Let $\Pi$ represent an orthogonal projection along the $z$ axis, and in general let $\vec{v}^\perp$ be the 2D
vector rotated ninety degrees clockwise from the 2D vector $\vec{v}$. Then the translation is $\vec{i}_0 - \Pi\vec{m}_0$,
and the rotation is about $\hat{z}$ by an angle $\psi$ given by

$$ \cos\psi = \hat{m}_{01} \cdot \hat{i}_{01}, \qquad \sin\psi = -\hat{m}_{01} \cdot \hat{i}_{01}^\perp $$

(see Fig. 9).

At this point, assign $\vec{m}_{01} = \vec{m}_1 - \vec{m}_0$, $\vec{m}_{02} = \vec{m}_2 - \vec{m}_0$, $\vec{i}_{01} = \vec{i}_1 - \vec{i}_0$, and $\vec{i}_{02} = \vec{i}_2 - \vec{i}_0$.
Also, consider the out-of-plane rotation to be a rotation about $\hat{i}_{01}$ by some angle $\theta$ followed by
a rotation about $\hat{i}_{01}^\perp$ by some angle $\phi$. Let us compute where the vectors $\hat{i}_{01}$ and $\hat{i}_{01}^\perp$ project to
after the two rotations and scale. To do this, we use Rodrigues’ formula: Let $R_{\hat{v},\tau}\vec{p}$ represent
a rotation of a point $\vec{p}$ about a direction $\hat{v}$ by an angle $\tau$. Rodrigues’ formula is

$$ R_{\hat{v},\tau}\vec{p} = \cos\tau\,\vec{p} + (1 - \cos\tau)(\hat{v} \cdot \vec{p})\hat{v} + \sin\tau(\hat{v} \times \vec{p}). \tag{70} $$
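
Rodrigues’ formula translates directly into code. The sketch below is a minimal version with a hypothetical function name, and it assumes the direction `v` is already a unit vector, since Equation 70 is stated for a direction.

```python
import math

def rodrigues(v, tau, p):
    """Rotate the 3D point p about the unit direction v by the angle tau."""
    c, s = math.cos(tau), math.sin(tau)
    v_dot_p = sum(vi * pi for vi, pi in zip(v, p))
    v_cross_p = (v[1] * p[2] - v[2] * p[1],
                 v[2] * p[0] - v[0] * p[2],
                 v[0] * p[1] - v[1] * p[0])
    return tuple(c * p[k] + (1.0 - c) * v_dot_p * v[k] + s * v_cross_p[k]
                 for k in range(3))
```

For instance, rotating the point (1, 0, 0) about the $z$ axis by ninety degrees gives (0, 1, 0) up to rounding, and rotating any point about a direction parallel to it leaves the point unchanged.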

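Using the formula, we can compute

$$ R_{\hat{i}_{01}^\perp,\phi}R_{\hat{i}_{01},\theta}\,\hat{i}_{01} = \cos\phi\,\hat{i}_{01} - \sin\phi\,\hat{z} \tag{71} $$

$$ R_{\hat{i}_{01}^\perp,\phi}R_{\hat{i}_{01},\theta}\,\hat{i}_{01}^\perp = \sin\theta\sin\phi\,\hat{i}_{01} + \cos\theta\,\hat{i}_{01}^\perp + \sin\theta\cos\phi\,\hat{z}. $$

Initially, $\vec{m}_{01}$ was rotated about $\hat{z}$ to align it with $\vec{i}_{01}$. In order for the scaled orthographic
projection of $\vec{m}_{01}$ to align with $\vec{i}_{01}$, Equation 71 implies that

$$ s = \frac{d_{01}}{\|\vec{m}_{01}\|\cos\phi} = \frac{d_{01}}{R_{01}\cos\phi} \tag{72} $$

Then

$$ s\Pi R_{\hat{i}_{01}^\perp,\phi}R_{\hat{i}_{01},\theta}\,\hat{i}_{01} = \frac{d_{01}}{R_{01}}\hat{i}_{01} \tag{73} $$

$$ s\Pi R_{\hat{i}_{01}^\perp,\phi}R_{\hat{i}_{01},\theta}\,\hat{i}_{01}^\perp = \frac{d_{01}}{R_{01}\cos\phi}(\sin\theta\sin\phi\,\hat{i}_{01} + \cos\theta\,\hat{i}_{01}^\perp) \tag{74} $$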


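Next, we use the expressions in Equations 73 and 74 to constrain $\theta$ and $\phi$ such that $\vec m_{02}$
projects along $\vec i_{02}$. When we aligned $\vec m_{01}$ and $\vec i_{01}$, $\vec m_{02}$ rotated to $R_{\hat z,\psi}\,\vec m_{02}$. Since
$\vec m_{02}$ has no $\hat z$ component (by assumption), we can represent $R_{\hat z,\psi}\,\vec m_{02}$ by

$$R_{02}\cos\xi\,\hat i_{01} + R_{02}\sin\xi\,\hat i_{01}^{\perp},$$

where $\xi$ is a known angle. Consequently, the transformed, projected, and scaled $\vec m_{02}$, which
must equal $\vec i_{02}$, is

$$\begin{aligned}
& s\,\Pi\,R_{\hat i_{01}^{\perp},\phi}\,R_{\hat i_{01},\theta}\left(R_{02}\cos\xi\,\hat i_{01} + R_{02}\sin\xi\,\hat i_{01}^{\perp}\right) \\
&\quad = R_{02}\cos\xi\left(s\,\Pi\,R_{\hat i_{01}^{\perp},\phi}\,R_{\hat i_{01},\theta}\,\hat i_{01}\right) + R_{02}\sin\xi\left(s\,\Pi\,R_{\hat i_{01}^{\perp},\phi}\,R_{\hat i_{01},\theta}\,\hat i_{01}^{\perp}\right) \\
&\quad = R_{02}\cos\xi\left(\frac{d_{01}}{R_{01}}\,\hat i_{01}\right) + R_{02}\sin\xi\left(\frac{d_{01}}{R_{01}\cos\phi}\left(\sin\theta\sin\phi\,\hat i_{01} + \cos\theta\,\hat i_{01}^{\perp}\right)\right) \\
&\quad = \frac{d_{01}R_{02}}{\cos\phi\,R_{01}}\left(\cos\xi\cos\phi + \sin\xi\sin\phi\sin\theta\right)\hat i_{01} + \frac{d_{01}R_{02}}{\cos\phi\,R_{01}}\left(\sin\xi\cos\theta\right)\hat i_{01}^{\perp}.
\end{aligned}$$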

Similar to $R_{\hat z,\psi}\,\vec m_{02}$, we can represent $\vec i_{02}$ as

$$\vec i_{02} = d_{02}\cos\omega\,\hat i_{01} + d_{02}\sin\omega\,\hat i_{01}^{\perp},$$

where $\omega$ is known. By equating terms we get

$$\frac{d_{01}R_{02}}{d_{02}R_{01}}\left(\cos\xi\cos\phi + \sin\xi\sin\phi\sin\theta\right) = \cos\phi\cos\omega \qquad (75)$$

$$\frac{d_{01}R_{02}}{d_{02}R_{01}}\left(\sin\xi\cos\theta\right) = \cos\phi\sin\omega. \qquad (76)$$

These two equations can be solved to get a biquadratic in $\cos\phi$:

$$\sin^2\omega\,\cos^4\phi - (t^2 + 1 - 2t\cos\omega\cos\xi)\cos^2\phi + t^2\sin^2\xi = 0, \qquad (77)$$

where

$$t = \frac{R_{02}\,d_{01}}{R_{01}\,d_{02}}. \qquad (78)$$

Since $R_{\hat z,\psi}\,\vec m_{01}$ is aligned with $\vec i_{01}$, we need $\cos\phi$ to be positive so that $\vec m_{01}$ projects in
the same direction as $\vec i_{01}$. The positive solutions are given by

$$\cos\phi = \frac{1}{|\sin\omega|}\sqrt{\nu \pm \sqrt{\nu^2 - t^2\sin^2\omega\sin^2\xi}} \qquad (79)$$

with

$$\nu = \tfrac{1}{2}\left(1 + t^2 - 2t\cos\omega\cos\xi\right).$$

This equation gives up to two solutions, but Grimson et al. make no further attempt to show
which solutions exist when, except to say the equation gives real solutions only if $\nu > 0$ or

$$\cos\omega\cos\xi < \frac{1 + t^2}{2t}. \qquad (80)$$
Given $\phi$, Equations 75 and 76 provide $\theta$:

$$\cos\theta = \frac{\sin\omega\cos\phi}{t\sin\xi} \qquad (81)$$

$$\sin\theta = \frac{\cos\phi\,(\cos\omega - t\cos\xi)}{t\sin\xi\sin\phi} \qquad (82)$$
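To make the use of Equations 77-82 concrete, here is a small sketch in Python that returns the candidate angles for given $t$, $\omega$, and $\xi$. The function name `solve_angles`, the decision to skip roots with $\cos^2\phi$ outside $(0, 1)$, and the choice of the positive branch for $\sin\phi$ are assumptions made for this example. The synthetic inputs are built from chosen angles through Equations 75 and 76 so that the result can be checked.

```python
import math

def solve_angles(t, omega, xi):
    """Candidate (cos_phi, cos_theta, sin_theta) triples from Equations 77-82.

    Assumes t, sin(omega) and sin(xi) are nonzero. Roots with cos^2(phi)
    outside (0, 1) are skipped, since Equation 82 divides by sin(phi).
    """
    nu = 0.5 * (1.0 + t * t - 2.0 * t * math.cos(omega) * math.cos(xi))
    disc = nu * nu - (t * math.sin(omega) * math.sin(xi)) ** 2
    if disc < 0.0:
        return []                              # no real solution
    candidates = []
    for sign in (1.0, -1.0):
        cos2_phi = (nu + sign * math.sqrt(disc)) / math.sin(omega) ** 2
        if not 0.0 < cos2_phi < 1.0:
            continue
        cos_phi = math.sqrt(cos2_phi)          # Equation 79, positive root
        sin_phi = math.sqrt(1.0 - cos2_phi)    # sign is ambiguous; + chosen here
        cos_theta = math.sin(omega) * cos_phi / (t * math.sin(xi))    # Eq. 81
        sin_theta = (cos_phi * (math.cos(omega) - t * math.cos(xi))
                     / (t * math.sin(xi) * sin_phi))                  # Eq. 82
        candidates.append((cos_phi, cos_theta, sin_theta))
    return candidates

# Synthetic check: pick angles, build (t, omega) so that Equations 75 and 76
# hold, then recover the angles from (t, omega, xi) alone.
theta, phi, xi = 0.4, 0.7, 1.1
A = math.cos(xi) * math.cos(phi) + math.sin(xi) * math.sin(phi) * math.sin(theta)
B = math.sin(xi) * math.cos(theta)
omega = math.atan2(B, A)
t = math.cos(phi) / math.hypot(A, B)
candidates = solve_angles(t, omega, xi)
```

Flipping the signs of both `sin_phi` and `sin_theta` in a candidate gives the reflected pose, since Equation 82 changes sign with $\sin\phi$.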


Given any model point $\vec m$, we can use the computed angles along with Rodrigues' formula
to find its image location. In particular, once $\vec m_0$ and $\vec i_0$ have been subtracted out, only the
scale and 3D rotation are left. The scale is given by Equation 72, and, as shown above, the
rotation is

$$R_{\hat i_{01}^{\perp},\phi}\,R_{\hat i_{01},\theta}\,R_{\hat z,\psi}. \qquad (83)$$

As with Ullman's method (Section 8.2), we do not know the signs of $\sin\theta$ and $\sin\phi$, but only
that inverting both signs simultaneously corresponds to the reflective ambiguity.


8.5 Summary of the three computations 

Here I summarize how each method can be used to compute 3D pose from three corresponding
points. To begin, transform the model and image points so that (1) the model points lie in the
image plane, (2) $\vec m_0$ and $\vec i_0$ are at the origin of the image coordinate system, and (3) $\vec m_1 - \vec m_0$
and $\vec i_1 - \vec i_0$ lie along the $x$ axis. Then use one of the three methods to compute the scale factor
and out-of-plane rotation, as follows:

• Ullman’s method 

1. Use Equations 54-56 to get a, b, and c. 

2. Substitute a, b, and c into Equation 57 to get s. 

3. Calculate $\cos\phi$ and $\cos\theta$ from $s$ (the two quotients have denominators $s x_1$ and $s y_2$, respectively).

4. Calculate $\sin\phi = \sqrt{1 - \cos^2\phi}$ and $\sin\theta = \sqrt{1 - \cos^2\theta}$.

5. Construct the rotation matrix R using Equation 49. 

• Huttenlocher and Ullman’s method 

1. Solve Equations 58 and 59 for $l_{11}$ and $l_{12}$, and Equations 60 and 61 for $l_{21}$ and $l_{22}$.

2. Let $\vec e_1 = (0,1)$ and $\vec e_2 = (1,0)$. (Any orthogonal, equal-length vectors can be used.)

3. Use Equation 62 to get $\vec e_1'$ and $\vec e_2'$.

4. Substitute $\vec e_1'$ and $\vec e_2'$ into Equations 63 and 64 to get $k_1$ and $k_2$.

5. Substitute $k_1$ and $k_2$ into Equation 66 to get $c_1$.

6 . If ci 7 ^ 0, calculate C 2 = ■^ L . Otherwise get C 2 from Equation 67. 

7. Use Equation 69 to get s. 
8. Use Equation 68 to get $sR$. Divide through by $s$ if $R$ is desired instead of $sR$.

• Grimson, Huttenlocher, and Alter's method

1. From the model points, compute $R_{01}$, $R_{02}$ and $\xi$, and, from the image points, compute
$d_{01}$, $d_{02}$, and $\omega$.

2. Use Equation 78 to get $t$.

3. Use Equation 79 to get $\cos\phi$.

4. Use Equation 72 to get $s$.

5. Calculate $\sin\phi = \sqrt{1 - \cos^2\phi}$.

6. Use Equations 81 and 82 to get $\cos\theta$ and $\sin\theta$.

7. To transform any point $\vec p$, substitute $\cos\phi$, $\sin\phi$, $\cos\theta$, $\sin\theta$, and $\vec p$ into Rodrigues'
formula, Equation 70, to get $R\,\vec p = R_{\hat i_{01}^{\perp},\phi}\,R_{\hat i_{01},\theta}\,\vec p$.
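Steps 1, 2, and 4 of the last method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model points are given as 2D coordinates within the model plane (taken to be parallel to the image plane, as in Section 8.4), the axes are $x$ to the right and $y$ up so that a clockwise quarter turn maps $(x, y)$ to $(y, -x)$, and the helper names (`perp_cw`, `lengths_and_angle`, `triangle_parameters`, `scale`) are invented for this example.

```python
import math

def perp_cw(v):
    """2D vector rotated ninety degrees clockwise (x right, y up assumed)."""
    return (v[1], -v[0])

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

def lengths_and_angle(v01, v02):
    """Return |v01|, |v02| and the angle of v02 in the (v01_hat, v01_hat_perp) basis."""
    n01, n02 = math.hypot(*v01), math.hypot(*v02)
    u = (v01[0] / n01, v01[1] / n01)
    up = perp_cw(u)
    angle = math.atan2(v02[0] * up[0] + v02[1] * up[1],
                       v02[0] * u[0] + v02[1] * u[1])
    return n01, n02, angle

def triangle_parameters(model, image):
    """Steps 1 and 2: (R01, R02, xi, d01, d02, omega, t) from 2D point triples."""
    m0, m1, m2 = model
    i0, i1, i2 = image
    R01, R02, xi = lengths_and_angle(sub(m1, m0), sub(m2, m0))
    d01, d02, omega = lengths_and_angle(sub(i1, i0), sub(i2, i0))
    t = (R02 * d01) / (R01 * d02)              # Equation 78
    return R01, R02, xi, d01, d02, omega, t

def scale(d01, R01, cos_phi):
    """Step 4: Equation 72."""
    return d01 / (R01 * cos_phi)

model = ((0.0, 0.0), (2.0, 0.0), (0.0, -1.0))
image = ((1.0, 1.0), (1.0, 4.0), (5.0, 1.0))
R01, R02, xi, d01, d02, omega, t = triangle_parameters(model, image)
```

For these made-up points the model sides have lengths 2 and 1, the image sides have lengths 3 and 4, both angles are a quarter turn, and $t = 3/8$. The value of $\cos\phi$ passed to `scale` would come from step 3.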

9 Stability and Comparison of Three Previous Solutions 

For computing 3D pose, it is desirable to know how the solutions compare in terms of stability.
To address this issue, let us examine how susceptible the solutions are to catastrophic
cancellation [31]. For ease of reference, I will indicate which steps in the pose computation
summaries of Section 8.5 may be unstable.

Ullman's solution computes $s$ in the same way as this paper does, and, as a result, is unstable
at the same places (see Section 6). For instance, precision may be lost if the model or image
points are nearly collinear when computing the coefficients, $a$, $b$, and $c$, of the biquadratic. (In
Section 8.5, this is step 1 of Ullman's solution.) Looking for a moment at Ullman's computation
of $a$ and $c$, it may appear that the computation is stable since there is no addition in Equation 54
or 56. In actuality, instability is hidden in the initial transformation that aligns the model and
image with the $x$ axis.

Given $s$, Ullman computes the cosines of the angles $\theta$ and $\phi$ and then implicitly uses
$\sqrt{1 - \cos^2\theta}$ and $\sqrt{1 - \cos^2\phi}$ to get their sines. (This is step 4 of Ullman's solution.)
Either sine could be inaccurate, however, if $\cos\theta$ or $\cos\phi$ is very close to one. Fig. 7 shows that
when this happens one of the vectors emanating from $\vec m_0$ is nearly parallel to the image plane.
When the rotation matrix $R$ is computed, inaccuracy in the sines affects the elements $r_{12}$, $r_{13}$,
and $r_{23}$ (see Equation 49). Since $r_{12}$ is affected, when the solution is used to transform an
unmatched model point, the instability can propagate to points that lie in the plane containing
the three matched model points, which is not true for the solution in this paper (Section 6).
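The loss of precision in step 4 is easy to reproduce. The sketch below, in Python with double-precision floats, recovers a sine from a cosine that is very close to one; the angle value is an arbitrary choice made for this illustration.

```python
import math

angle = 1e-8                       # a side almost parallel to the image plane
cos_value = math.cos(angle)        # rounds to (nearly) 1.0 in double precision

# Step 4 style computation: the subtraction cancels almost all significant digits.
naive_sin = math.sqrt(max(0.0, 1.0 - cos_value * cos_value))

reference_sin = math.sin(angle)    # accurate value, for comparison only
relative_error = abs(naive_sin - reference_sin) / reference_sin
```

Although the cosine itself is correct to within rounding, the sine obtained from it is wrong by a large fraction of its true value, which is the kind of error that then enters the rotation matrix.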

For Huttenlocher and Ullman's method in Section 8.5, catastrophic cancellations may occur
in step 1 when $l_{11}$, $l_{12}$, $l_{21}$, and $l_{22}$ are computed, in step 4 when $k_1$ and $k_2$ are computed, and
in step 8 when $sR$ is computed. Instability in step 4 can affect $c_1$ and $c_2$ in Equation 49: If $c_1$ is
near zero, then $k_1$ and $k_2$ in Equation 66 also must be near zero. From Equations 63 and 64, $k_1$
and $k_2$ are computed with additions, and so cancellation can occur if they are small. Similarly,
if $c_2$ is near zero, $k_1$ must be small as well, and so again cancellation can occur. From Fig. 8, $c_1$
or $c_2$ is nearly zero when one of the vectors emanating from the origin ($\vec m_0$) is nearly parallel
to the image plane.

When $l_{11}$, $l_{12}$, $l_{21}$, and $l_{22}$ are computed (Equations 58-61), the results will be inaccurate if
the determinant, $x_1 y_2 - x_2 y_1$, is close to zero, which happens exactly when the model points are
almost collinear. In addition, if any of $l_{11}$, $l_{12}$, $l_{21}$, or $l_{22}$ is almost zero, then cancellation can
occur in computing it. There are many pairs of model and image triples that can make one or
more of $l_{11}$, $l_{12}$, $l_{21}$, $l_{22}$ close to zero (e.g., $l_{12} \approx 0$ whenever the $x$ coordinates of the two
model points nearly equal the $x$ coordinates of their image points, independent of the $y$
coordinates). Furthermore, in step 8, the additions in computing $sr_{13}$ and $sr_{23}$ can
also contribute to instability (see Equation 68). Note, however, that the image triangle being
nearly collinear does not necessarily make the computation unstable.

In Grimson et al.'s solution, instability may arise in step 5 if $\cos\phi$ is almost 1, in step 6 if $t$
is 1 and $\omega$ is close to $\xi$, and in step 7 if $\cos\phi$ is near 1 or $\cos\theta$ is near 1. As with the solution in
this paper, these situations occur when the model or image points are nearly collinear or when
one of the sides of the model triangle that emanates from $\vec m_0$ is nearly parallel to the image
plane. Like Ullman's method and Huttenlocher and Ullman's method, however, instability can
propagate to points inside the plane of the matched model points (in step 7).

In summary, each of the three previous solutions spreads instability in the pose solution
to points in the plane of the three matched model points; however, the solution in this paper
does not. Furthermore, the situations in which instability can arise are the same for Ullman's
method, the method of Grimson et al., and the solution in this paper. Specifically, these
situations are when one of the vectors from $\vec m_0$ is nearly parallel to the image plane, when the
model points are nearly collinear, and when the image points are nearly collinear. Huttenlocher
and Ullman's method is unstable in the first two situations as well, which is expected since in
these situations the problem is ill conditioned (Section 6). In addition, Huttenlocher and
Ullman's method can be unstable in many cases where the other methods are not, but may be
more stable in the case that the image points are nearly collinear.

10 Conclusion 

The weak-perspective three-point problem is fundamental to many approaches to model-based 
recognition. In this paper, I illustrated the underlying geometry, and then used it to derive 
a new solution to the problem and to explain the various special cases that can arise. In 
particular, the times when there are zero, one, and two solutions are described graphically. 

The new solution is based on the distances between the matched model and image points 
and is used to recover the three-space locations of the model points in image coordinates. From 
the recovered locations, a direct expression for the image location of a fourth model point is 
obtained. In contrast, earlier solutions computed an initial transformation that brought the 
model into image coordinates, and then computed an additional transformation to align the 
matched model points to their corresponding image points. As a result, the solution given here 
should be easier to use, and, for recognition systems that repeat the computation of the model 
pose many times, should be more efficient. 

Another difference with the method presented here is that it makes evident the symmetry 
of the solution with respect to the ordering of the model and image points. Previous methods
that are based on the coordinates of the points after some initial transformations make this 
symmetry unclear. 

Furthermore, this paper provides stability analyses for both the new and past solutions, none 
of which had been analyzed for stability previously. Each computation is examined for places 
where precision may be lost. From these places, the geometries that give rise to instability are 
inferred. These geometries are used to distinguish instabilities that arise in situations where 
the problem is ill conditioned, that is, situations where instability is inherent, from ones that 
are due to the particular computation. 

In giving another solution, this paper revisits Ullman’s original biquadratic equation for the 
scale factor, but, in addition, goes on to interpret both solutions to the equation, and to prove 
which one is correct. The false solution is shown to correspond to inverting the roles of the 
model and image points. 

Lastly, the new solution is accompanied by a proof that the solution exists and is unique. 
Of the previous methods, only Huttenlocher and Ullman’s demonstrates this as well, and was 
the first to do so. Such proofs may be useful for gaining insights into related problems as well 
as the problem itself. Even so, since existence and uniqueness have been established, all the 
solutions are valid, and should all be considered when a related problem needs to be solved. 
A Rigid Transform between 3 Corresponding 3D Points

This appendix computes a rigid transform between two sets of three corresponding points using
right-handed coordinate systems built separately on each set of three points. A right-handed
system is determined by an origin point, $\vec o$, and three perpendicular unit vectors, $(\hat u, \hat v, \hat w)$.
Given three points in space, $\vec p_0$, $\vec p_1$, $\vec p_2$, we can construct a right-handed system as follows: Let
$\vec p_{01} = \vec p_1 - \vec p_0$ and $\vec p_{02} = \vec p_2 - \vec p_0$. Then let

$$\vec o = \vec p_0$$

$$\hat u = \hat p_{01}$$

$$\hat v = \frac{\vec p_{02} - (\vec p_{02}\cdot\hat p_{01})\,\hat p_{01}}{\|\vec p_{02} - (\vec p_{02}\cdot\hat p_{01})\,\hat p_{01}\|}$$

$$\hat w = \hat u\times\hat v$$

Let $(\vec o_1; \hat u_1, \hat v_1, \hat w_1)$ and $(\vec o_2; \hat u_2, \hat v_2, \hat w_2)$ be the coordinate systems so defined for the
original and camera-centered points, respectively.

Given a coordinate system $(\vec o; \hat u, \hat v, \hat w)$, a rigid transformation that takes a point in world
coordinates to a point in that coordinate system is given by $(R, \vec t)$, where

$$R = [\hat u\ \ \hat v\ \ \hat w], \qquad \vec t = \vec o$$

(see for example [6]); the transformed $\vec p$ is $R\,\vec p + \vec t$. Then we can bring a point $\vec p$ from the
original system to the world and then to the camera-centered system using

$$R_2\left(R_1^T(\vec p - \vec t_1)\right) + \vec t_2 = R_2 R_1^T\,\vec p + \vec t_2 - R_2 R_1^T\,\vec t_1$$

where

$$R_1 = [\hat u_1\ \ \hat v_1\ \ \hat w_1], \qquad \vec t_1 = \vec o_1$$

$$R_2 = [\hat u_2\ \ \hat v_2\ \ \hat w_2], \qquad \vec t_2 = \vec o_2.$$

Consequently a rigid transformation $(R, \vec t)$ that aligns the two coordinate systems is

$$R = R_2 R_1^T, \qquad \vec t = \vec t_2 - R_2 R_1^T\,\vec t_1. \qquad (84)$$
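The construction in this appendix can be written out directly. The following Python sketch builds the two frames and forms $(R, \vec t)$ as in Equation 84; it uses plain tuples and lists rather than a matrix library, and the function names (`frame`, `rigid_transform`, `apply`) as well as the sample points are assumptions made for the illustration. The two point triples are assumed to form congruent, non-degenerate triangles.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def frame(p0, p1, p2):
    """Origin and right-handed orthonormal axes (u, v, w) built on three points."""
    u = unit(sub(p1, p0))
    p02 = sub(p2, p0)
    v = unit(tuple(p02[k] - dot(p02, u) * u[k] for k in range(3)))
    w = cross(u, v)
    return p0, (u, v, w)

def rigid_transform(src, dst):
    """(R, t) of Equation 84, taking the three src points onto the three dst points."""
    o1, axes1 = frame(*src)
    o2, axes2 = frame(*dst)
    # R = R2 * R1^T, where the columns of R1 and R2 are the frame axes.
    R = [[sum(axes2[k][i] * axes1[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = tuple(o2[i] - dot(R[i], o1) for i in range(3))
    return R, t

def apply(R, t, p):
    return tuple(dot(R[i], p) + t[i] for i in range(3))

# Sample data: dst is src rotated a quarter turn about z and shifted by (0, 0, 5).
src = ((1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0))
dst = ((0.0, 1.0, 5.0), (0.0, 2.0, 5.0), (-1.0, 1.0, 5.0))
R, t = rigid_transform(src, dst)
fourth = apply(R, t, (1.0, 0.0, 1.0))
```

Because the frames are orthonormal, the inverse of $R_1$ is its transpose, which is why no matrix inversion appears in the code.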


B Biquadratic for the Scale Factor 


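This appendix shows

$$4\left(s^2R_{01}^2 - d_{01}^2\right)\left(s^2R_{02}^2 - d_{02}^2\right) = \left(s^2\left(R_{12}^2 - R_{01}^2 - R_{02}^2\right) - \left(d_{12}^2 - d_{01}^2 - d_{02}^2\right)\right)^2 \qquad (85)$$

is equivalent to a biquadratic in $s$.

Expanding Equation 85,

$$\begin{aligned}
4\left(s^4R_{01}^2R_{02}^2 - s^2\left(R_{01}^2d_{02}^2 + R_{02}^2d_{01}^2\right) + d_{01}^2d_{02}^2\right) ={}& s^4\left(R_{01}^2 + R_{02}^2 - R_{12}^2\right)^2 \\
& - 2s^2\left(R_{01}^2 + R_{02}^2 - R_{12}^2\right)\left(d_{01}^2 + d_{02}^2 - d_{12}^2\right) \\
& + \left(d_{01}^2 + d_{02}^2 - d_{12}^2\right)^2
\end{aligned}$$

$$\begin{aligned}
& s^4\left(4R_{01}^2R_{02}^2 - \left(R_{01}^2 + R_{02}^2 - R_{12}^2\right)^2\right) \\
&\quad - 2s^2\left(2R_{01}^2d_{02}^2 + 2R_{02}^2d_{01}^2 - \left(R_{01}^2 + R_{02}^2 - R_{12}^2\right)\left(d_{01}^2 + d_{02}^2 - d_{12}^2\right)\right) \\
&\quad + \left(4d_{01}^2d_{02}^2 - \left(d_{01}^2 + d_{02}^2 - d_{12}^2\right)^2\right) = 0
\end{aligned}$$

$$as^4 - 2bs^2 + c = 0,$$

where

$$a = 4R_{01}^2R_{02}^2 - \left(R_{01}^2 + R_{02}^2 - R_{12}^2\right)^2$$

$$b = 2R_{01}^2d_{02}^2 + 2R_{02}^2d_{01}^2 - \left(R_{01}^2 + R_{02}^2 - R_{12}^2\right)\left(d_{01}^2 + d_{02}^2 - d_{12}^2\right)$$

$$c = 4d_{01}^2d_{02}^2 - \left(d_{01}^2 + d_{02}^2 - d_{12}^2\right)^2.$$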
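As a numerical illustration of the coefficients above, the sketch below computes $a$, $b$, and $c$ from the six side lengths and returns the positive roots of $as^4 - 2bs^2 + c = 0$. The function names and the sample triangle are assumptions made for this example: the model points are $(0,0,0)$, $(1,0,1)$, and $(0,2,0.5)$, imaged with scale 2 and no rotation, so the image points are $(0,0)$, $(2,0)$, and $(0,4)$. The code assumes $a > 0$, that is, non-collinear model points.

```python
import math

def scale_biquadratic(R01, R02, R12, d01, d02, d12):
    """Coefficients (a, b, c) of a*s**4 - 2*b*s**2 + c = 0."""
    model_term = R01**2 + R02**2 - R12**2
    image_term = d01**2 + d02**2 - d12**2
    a = 4 * R01**2 * R02**2 - model_term**2
    b = 2 * R01**2 * d02**2 + 2 * R02**2 * d01**2 - model_term * image_term
    c = 4 * d01**2 * d02**2 - image_term**2
    return a, b, c

def positive_roots(a, b, c):
    """Positive real solutions s, smallest first (requires a > 0)."""
    disc = max(0.0, b * b - a * c)   # clamp tiny negative values from rounding
    roots = []
    for sign in (-1.0, 1.0):
        s_squared = (b + sign * math.sqrt(disc)) / a
        if s_squared > 0.0:
            roots.append(math.sqrt(s_squared))
    return roots

# Side lengths of the sample model triangle and of its image at scale 2.
R01, R02, R12 = math.sqrt(2.0), math.sqrt(4.25), math.sqrt(5.25)
d01, d02, d12 = 2.0, 4.0, math.sqrt(20.0)
a, b, c = scale_biquadratic(R01, R02, R12, d01, d02, d12)
roots = positive_roots(a, b, c)
```

For this triangle the coefficients come out as $a = 33$, $b = 98$, and $c = 256$ up to rounding, and the roots are $\sqrt{64/33}$ and 2; the scale used to generate the image is the second of these.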
C Two Solutions for Scale

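This appendix proves Proposition 1. The proof uses the following lemma:

Lemma: Let $f$ be either $\left(\frac{d_{01}}{R_{01}}\right)^2$ or $\left(\frac{d_{02}}{R_{02}}\right)^2$. Then

$$af^2 - 2bf + c \le 0. \qquad (86)$$

Proof:

$$\begin{aligned}
af^2 - 2bf + c ={}& 4\left(R_{01}R_{02}\sin\phi\right)^2 f^2 - 2\left(2\left(R_{01}^2d_{02}^2 + R_{02}^2d_{01}^2 - 2R_{01}R_{02}d_{01}d_{02}\cos\phi\cos\psi\right)\right)f \\
& + 4\left(d_{01}d_{02}\sin\psi\right)^2, \quad \text{from Equations 24, 25, and 26} \\
={}& 4\Big(R_{01}^2R_{02}^2\left(1 - \cos^2\phi\right)f^2 - \left(R_{01}^2d_{02}^2 + R_{02}^2d_{01}^2 - 2R_{01}R_{02}d_{01}d_{02}\cos\phi\cos\psi\right)f \\
& + d_{01}^2d_{02}^2\left(1 - \cos^2\psi\right)\Big) \qquad (87)
\end{aligned}$$

Suppose that $f = \left(\frac{d_{01}}{R_{01}}\right)^2$. Then 87 becomes

$$\begin{aligned}
& 4\left(-\frac{R_{02}^2d_{01}^4}{R_{01}^2}\cos^2\phi + 2\,\frac{R_{02}\,d_{01}^3\,d_{02}}{R_{01}}\cos\phi\cos\psi - d_{01}^2d_{02}^2\cos^2\psi\right) \\
&\quad = -4R_{02}^2d_{01}^2\left(\frac{d_{01}}{R_{01}}\cos\phi - \frac{d_{02}}{R_{02}}\cos\psi\right)^2
\end{aligned}$$

Suppose instead that $f = \left(\frac{d_{02}}{R_{02}}\right)^2$. Then 87 becomes

$$\begin{aligned}
& 4\left(-\frac{R_{01}^2d_{02}^4}{R_{02}^2}\cos^2\phi + 2\,\frac{R_{01}\,d_{02}^3\,d_{01}}{R_{02}}\cos\phi\cos\psi - d_{01}^2d_{02}^2\cos^2\psi\right) \\
&\quad = -4R_{01}^2d_{02}^2\left(\frac{d_{02}}{R_{02}}\cos\phi - \frac{d_{01}}{R_{01}}\cos\psi\right)^2
\end{aligned}$$

Either way, $af^2 - 2bf + c \le 0$.
□

Proposition 1: Let

$$s_1 = \sqrt{\frac{b - \sqrt{b^2 - ac}}{a}}, \qquad s_2 = \sqrt{\frac{b + \sqrt{b^2 - ac}}{a}}.$$

Then

$$s_1 \le \frac{d_{01}}{R_{01}},\ \frac{d_{02}}{R_{02}} \le s_2.$$

Proof: Starting from the result of the lemma,

$$af^2 - 2bf + c \le 0$$

$$\frac{1}{a}\left[(af - b)^2 - (b^2 - ac)\right] \le 0$$

$$(af - b)^2 \le b^2 - ac, \quad \text{since } a > 0$$

$$|af - b| \le \sqrt{b^2 - ac}$$

$$-(af - b) \le \sqrt{b^2 - ac} \quad \text{and} \quad af - b \le \sqrt{b^2 - ac}$$

$$f \ge \frac{b - \sqrt{b^2 - ac}}{a} \quad \text{and} \quad f \le \frac{b + \sqrt{b^2 - ac}}{a}$$

$$s_1^2 \le f \le s_2^2$$

$$s_1 \le \frac{d_{01}}{R_{01}},\ \frac{d_{02}}{R_{02}} \le s_2$$

□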

D One Solution for Scale 

In the "one solution" case, we wish to know when and if $b^2 - ac = 0$ holds. Using the result of
Appendix F, this means that

$$4\left(R_{01}d_{02}\right)^4\left(t^2 - 2\cos(\phi + \psi)\,t + 1\right)\left(t^2 - 2\cos(\phi - \psi)\,t + 1\right) = 0.$$

For this to hold, either

$$t^2 - 2\cos(\phi + \psi)\,t + 1 = 0 \quad \text{or} \quad t^2 - 2\cos(\phi - \psi)\,t + 1 = 0.$$

Solving for $t$ gives

$$t = \cos(\phi + \psi) \pm i\sin(\phi + \psi) \quad \text{or} \quad t = \cos(\phi - \psi) \pm i\sin(\phi - \psi), \qquad (88)$$

where $i = \sqrt{-1}$. Consequently, there are real values of $t$ that make $b^2 - ac = 0$ only if
$\sin(\phi + \psi) = 0$ or $\sin(\phi - \psi) = 0$. These situations occur when $\phi = \pm\psi$ and $\phi = \pm\psi + \pi$.
Substituting into Equation 88 gives that $b^2 - ac = 0$ iff both $\phi = \pm\psi$ or $\phi = \pm\psi + \pi$ and $t = 1$,
where $t = 1$ is the same as $\frac{d_{01}}{R_{01}} = \frac{d_{02}}{R_{02}}$.
E No Solutions for Scale


This appendix shows that there always exists a solution to the biquadratic by showing that
$b^2 - ac \ge 0$. From Appendix F,

$$\begin{aligned}
b^2 - ac &= 4\left(R_{01}d_{02}\right)^4\left(t^2 - 2\cos(\phi + \psi)\,t + 1\right)\left(t^2 - 2\cos(\phi - \psi)\,t + 1\right) \\
&\ge 4\left(R_{01}d_{02}\right)^4\left(t^2 - 2t + 1\right)\left(t^2 - 2t + 1\right) \\
&= 4\left(R_{01}d_{02}\right)^4\left(t - 1\right)^4 \\
&\ge 0
\end{aligned}$$

F Simplifying b 2 — ac 

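In this appendix, I derive that

$$b^2 - ac = 4\left(R_{01}d_{02}\right)^4\left(t^2 - 2\cos(\phi + \psi)\,t + 1\right)\left(t^2 - 2\cos(\phi - \psi)\,t + 1\right), \qquad (89)$$

where

$$t = \frac{R_{02}\,d_{01}}{R_{01}\,d_{02}}.$$

From Equations 24, 25, and 26,

$$a = 4\left(R_{01}R_{02}\sin\phi\right)^2$$

$$b = 2\left(R_{01}^2d_{02}^2 + R_{02}^2d_{01}^2 - 2R_{01}R_{02}d_{01}d_{02}\cos\phi\cos\psi\right)$$

$$c = 4\left(d_{01}d_{02}\sin\psi\right)^2$$

Then

$$\begin{aligned}
b^2 ={}& 4\Big(R_{02}^4d_{01}^4 - 4R_{02}^3d_{01}^3R_{01}d_{02}\cos\phi\cos\psi + 2R_{01}^2R_{02}^2d_{01}^2d_{02}^2 \\
& + 4R_{01}^2R_{02}^2d_{01}^2d_{02}^2\cos^2\phi\cos^2\psi - 4R_{01}^3d_{02}^3R_{02}d_{01}\cos\phi\cos\psi + R_{01}^4d_{02}^4\Big)
\end{aligned}$$

$$ac = 16R_{01}^2R_{02}^2d_{01}^2d_{02}^2\sin^2\phi\sin^2\psi$$

$$\begin{aligned}
b^2 - ac ={}& 4\Big(R_{02}^4d_{01}^4 - 4R_{02}^3d_{01}^3R_{01}d_{02}\cos\phi\cos\psi \\
& + \left(2 + 4\cos^2\phi\cos^2\psi - 4\sin^2\phi\sin^2\psi\right)R_{01}^2R_{02}^2d_{01}^2d_{02}^2 \\
& - 4R_{01}^3d_{02}^3R_{02}d_{01}\cos\phi\cos\psi + R_{01}^4d_{02}^4\Big) \\
={}& 4\left(R_{01}d_{02}\right)^4\Big(t^4 - 4\cos\phi\cos\psi\,t^3 + \left(2 + 4\cos^2\phi\cos^2\psi - 4\sin^2\phi\sin^2\psi\right)t^2 \\
& - 4\cos\phi\cos\psi\,t + 1\Big), \quad \text{where } t = \frac{R_{02}\,d_{01}}{R_{01}\,d_{02}} \\
={}& 4\left(R_{01}d_{02}\right)^4\Big(t^4 - 2\left(\cos(\phi + \psi) + \cos(\phi - \psi)\right)t^3 \\
& + \left(2 + 4\cos(\phi + \psi)\cos(\phi - \psi)\right)t^2 - 2\left(\cos(\phi + \psi) + \cos(\phi - \psi)\right)t + 1\Big) \\
={}& 4\left(R_{01}d_{02}\right)^4\left(t^2 - 2\cos(\phi + \psi)\,t + 1\right)\left(t^2 - 2\cos(\phi - \psi)\,t + 1\right)
\end{aligned}$$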
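The factorization in Equation 89 can also be spot-checked numerically. The Python sketch below evaluates both sides for randomly drawn lengths and angles; the sampling ranges, the fixed seed, and the function name `both_sides` are arbitrary choices made for this check, which illustrates the identity but does not replace the derivation.

```python
import math
import random

def both_sides(R01, R02, d01, d02, phi, psi):
    """Return (b^2 - ac, right-hand side of Equation 89)."""
    a = 4 * (R01 * R02 * math.sin(phi)) ** 2
    b = 2 * (R01**2 * d02**2 + R02**2 * d01**2
             - 2 * R01 * R02 * d01 * d02 * math.cos(phi) * math.cos(psi))
    c = 4 * (d01 * d02 * math.sin(psi)) ** 2
    t = (R02 * d01) / (R01 * d02)
    rhs = (4 * (R01 * d02) ** 4
           * (t * t - 2 * math.cos(phi + psi) * t + 1)
           * (t * t - 2 * math.cos(phi - psi) * t + 1))
    return b * b - a * c, rhs

rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    lengths = [rng.uniform(0.5, 2.0) for _ in range(4)]
    angles = [rng.uniform(0.1, 3.0) for _ in range(2)]
    lhs, rhs = both_sides(*lengths, *angles)
    max_gap = max(max_gap, abs(lhs - rhs) / (1.0 + abs(lhs)))
```

Over these samples the two sides agree to within floating-point rounding, and the right-hand side is never negative, in line with Appendix E.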